Abstract. A principal wishes to transact business with a multidimensional distribution of agents
whose preferences are known only in the aggregate. Assuming a twist (= generalized Spence- 
Mirrlees single-crossing) hypothesis and that agents can choose only pure strategies, we identify a 
structural condition on the preference b(x, y) of agent type x for product type y — and on the prin- 
cipal's costs c(y) — which is necessary and sufficient for reducing the profit maximization problem 
faced by the principal to a convex program. This is a key step toward making the principal's prob- 
lem theoretically and computationally tractable; in particular, it allows us to derive uniqueness 
and stability of the principal's optimum strategy — and similarly of the strategy maximizing the 
expected welfare of the agents when the principal's profitability is constrained. We call this con- 
dition non-negative cross-curvature: it is also (i) necessary and sufficient to guarantee convexity 
of the set of b-convex functions, (ii) invariant under reparametrization of agent and/or product 
types by diffeomorphisms, and (iii) a strengthening of Ma, Trudinger and Wang's necessary and 
sufficient condition (A3w) for continuity of the correspondence between an exogenously prescribed 
distribution of agents and of products. We derive the persistence of economic effects such as the 
desirability for a monopoly to establish prices so high they effectively exclude a positive fraction 
of its potential customers, in nearly the full range of non-negatively cross-curved models. 
1. Introduction



The principal-agent paradigm provides a microeconomic framework for modeling non-competitive
decision problems which must be made in the face of informational asymmetry. Such problems range 
from monopolist nonlinear pricing [46] [60] [22] [66] [3] and product line design ("customer screening")
[54] [37] [50], to optimal taxation [42] [43], labour market signalling and contract theory [58] [59] 
[2], regulation of monopolies [5] [38] [32] [28] [4] including public utilities [47] [9], and mechanism 
design [23] [39] [45]. A typical example would be the problem faced by a monopolist who wants 
to market automobiles $y \in Y$ to a population of potential buyers ("agents") $x \in X$. Knowing the
preferences b(x,y) of buyer x for car y, the relative frequency $d\mu(x)$ of different buyer types in the
population, and the cost c(y) she incurs in manufacturing car type y, the principal needs to decide 
which products (or product bundles) to manufacture and how much to charge for each of them, so 
as to maximize her profits. 

In the simplest models, e.g. [58] [54], there are only a finite number of product possibilities 
(e.g. with air conditioning, or without) and a finite number of buyer types (e.g. rich, middle-class, 
and poor); or possibly a one-dimensional continuum of product possibilities (parameterized, say, by 
quality) and of agent types (parameterized, say, by income) [42] [59] [46] [5]. Of course, real cars 
depend on more than one parameter — fuel efficiency, comfort, options, reliability, styling, handling 
and safety, to name a few — as do car shoppers, who vary in wealth, income, age, commuting 
needs, family size, personal disposition, etc. Thus realistic modeling requires multidimensional type 
spaces $X \subset \mathbf{R}^m$ and $Y \subset \mathbf{R}^n$ as in [39] [65] [44] [51] [7] [16]. Although such models can often be
reduced to optimization problems in the calculus of variations [13] [6], in the absence of convexity 
they remain dauntingly difficult to analyze. Convexity — whether manifest or hidden — rules out 
critical points other than global minima, and is often the key to locating and characterizing optimal 
strategies either numerically or theoretically. The purpose of the present article is to determine 
when convexity is present, assuming the dimensions m = n of the agent and product type spaces 
coincide. 

An archetypal model was addressed by Wilson [66], Armstrong [3], and Rochet and Choné [50]. A
particular example from the last of these studies makes the simplifying hypotheses $X = Y = [0,\infty[^n$,
$c(y) = |y|^2/2$, and $b(x,y) = \langle x, y\rangle$. By assuming this bilinearity of buyer preferences, Rochet and
Choné were able to show that the principal's problem can be reduced to a quadratic minimization
over the set of non-negative convex functions — itself a convex set. Although the convexity constraint 
makes this variational problem non-standard, for buyers distributed uniformly throughout the unit 
square, they exploited a combination of theoretical and computational analysis to show a number 
of results of economic interest. Their most striking conclusion was that the profit motive alone 
leads the principal to discriminate between three different types of buyers: (i) low-end customers 
whom she will not market cars to, because — as Armstrong had already discovered — making cars 
affordable to this segment of the market would cost her too much of her mid-range and high-end 
profits; (ii) mid-range customers, whom she will encourage to choose from a one-parameter family 
of affordably-priced compromise vehicles; (iii) high-end customers, whom she will use both available 
dimensions of her product space to market expensive vehicles individually tailored to suit each 
customer's desires. Whether or not such bunching phenomena are robust is an unanswered question
of considerable interest which, due to their specificity to particular preference functions, the
techniques of the foregoing authors remain unable to address. The possibility of non-robustness
was highlighted in [7] ; below we go further to suggest which specific perturbations of the preference 
function b(x,y) are most likely to yield robust results. On the other hand, our conclusions confirm 
Armstrong's assertion that what he called the desirability of exclusion is a very general phenomenon 
in the models we study (Theorem 4.7). This exclusion, however, is not generic when the dimensions
of the type and allocation spaces differ [16]: Deneckere and Severinov gave necessary and sufficient 
conditions for exclusion when (m,n) = (2, 1). 

For general preferences b(x,y), the principal's problem can be reformulated as a minimization
problem over the space of b-convex functions (Definition 3.1), according to Carlier [13]. Such functions
generally form a compact but non-convex set, which prevented Carlier from deducing much
more than the existence of an optimal strategy for the principal — a result which can also be 
obtained using the method of Monteiro and Page [45] (for related developments see Basov [7] or
Rochet and Stole [51]). Our present purpose is to identify conditions on the agent preferences which 
guarantee convexity of this feasible set (Theorem 3.2). In the setting we choose, the conditions we 
find will actually be necessary as well as sufficient for convexity; this necessity imparts a significance 
to these conditions even if they appear unexpected or unfamiliar. If, in addition, the principal's 
manufacturing cost c(y) is b*-convex, for b*(y,x) := b(x,y), the principal's problem becomes a
convex program which renders it much more amenable to standard theoretical and computational 
techniques. Although the resulting problem retains the complexities of the Wilson, Armstrong, and
Rochet and Choné models, we are able to deduce new results which remained inaccessible until
now, such as conditions guaranteeing uniqueness (Theorem 4.5) and stability (Corollary 4.6) of the 
principal's optimum strategy. The same considerations and results apply also to the problem of 
maximizing the total welfare of the agents under the constraint that it remain possible for the
principal to operate without sustaining a loss (Remark 5.1). 

The initial impetus for this study emerged from discussions with Ivar Ekeland. RJM is pleased to
express his gratitude to Ekeland for introducing him to the principal-agent problem in 1996, and for 
anticipating already at that time that it ought to be tackled using techniques from the mathematical 
theory of optimal transportation. This approach was exploited by Carlier [13] in his doctoral thesis, 
following earlier works by Rochet [48] [49] and Rochet and Choné [50], and was recently extended to
a different but related class of problems by Buttazzo and Carlier [10]. We are grateful to Giuseppe 
Buttazzo and Guillaume Carlier also, for stimulating discussions. 



2. Hypotheses: the basic framework

As in Ma, Trudinger and Wang's work concerning the smoothness of optimal mappings [36], let
us assume the buyer preferences satisfy the following hypotheses. Let $\bar X$ denote the closure of any
given set $X \subset \mathbf{R}^n$, and for each $(x_0, y_0) \in \bar X \times \bar Y$ assume:
(B0) $b \in C^4(\bar X \times \bar Y)$, where $X \subset \mathbf{R}^n$ and $Y \subset \mathbf{R}^n$ are open and bounded;
(B1) (bi-twist) both $x \in \bar X \mapsto D_y b(x, y_0)$ and $y \in \bar Y \mapsto D_x b(x_0, y)$ are diffeomorphisms onto their ranges;
(B2) (bi-convexity) both ranges $\bar X_{y_0} := D_y b(\bar X, y_0)$ and $\bar Y_{x_0} := D_x b(x_0, \bar Y)$ are convex subsets of $\mathbf{R}^n$.

Here the subscript $x_0$ serves as a reminder that $\bar Y_{x_0}$ denotes a subset of the cotangent space
$T^*_{x_0} \bar X = \mathbf{R}^n$ to $\bar X$ at $x_0$. Note (B1) is a strengthened form of the multidimensional generalization
[55] [20] [31] of the Spence-Mirrlees single-crossing condition expressed in Rüschendorf, in Gangbo,
and in Levin. It asserts the marginal utility of buyer type $x_0$ in equation (4.2) determines the
product he selects uniquely and smoothly, and similarly that the buyer type who selects product $y_0$ will
be a well-defined smooth function of $y_0$ and the marginal cost of that product; (B1) is much less
restrictive than the generalized single crossing condition proposed by McAfee and McMillan [39], 
since the iso-price curves in the latter context become hyperplanes, effectively reducing the problem 
to a single dimension. We also assume 
(B3) (non-negative cross-curvature)

(2.1)   $\frac{\partial^4}{\partial s^2\, \partial t^2}\Big|_{(s,t)=(0,0)} b(x(s),y(t)) \ge 0$

for each curve $t \in [-1,1] \mapsto (D_y b(x(t),y(0)),\, D_x b(x(0),y(t)))$ forming an affinely parameterized
line segment in $\bar X_{y(0)} \times \bar Y_{x(0)} \subset \mathbf{R}^{2n}$. If the inequality (2.1) becomes strict whenever x'(0) and y'(0)
are non-vanishing, we say the preference function b is positively cross-curved, and denote this by
(B3u).

Remark 2.1 (Mathematical lineage). Condition (B3) can alternately be defined as in Lemma 6.1 
using Definition 4.1; the convexity asserted by that lemma may appear more intuitive and natural 
than (B3) from the point of view of applications. Historically, non-negative cross-curvature arose as a
strengthening of Trudinger and Wang's criterion (A3w) guaranteeing smoothness of optimal maps in
the Monge-Kantorovich transportation problem [62]; unlike us, they require (2.1) only if, in addition,

(2.2)   $\frac{\partial^2}{\partial s\, \partial t}\Big|_{(s,t)=(0,0)} b(x(s),y(t)) = 0.$



Necessity of Trudinger and Wang's condition for continuity was shown by Loeper [33], who (like [25] 
[61]) also noted its covariance and some of its relations to the geometric notion of curvature. Their 
condition relaxes a hypothesis proposed with Ma [36], which required strict positivity of (2.1) when 
(2.2) holds. The strengthening considered here was first studied in a different but equivalent form 
by Kim and McCann, where both the original and the modified conditions were shown to correspond
to pseudo-Riemannian sectional curvature conditions induced by buyer preferences on $X \times Y$,
thus highlighting their invariance under reparametrization of either X or Y by diffeomorphism; see 
Lemma 4.5 of [25]. Other variants and refinements of Ma, Trudinger, and Wang's condition have 
been proposed and investigated by Figalli and Rifford [19] and Loeper and Villani [35] for different 
purposes at about the same time. 

Kim and McCann showed non-negative cross-curvature guarantees tensorizability of condition
(B3), which is useful for building examples of preference functions which satisfy it [26]; in suitable
coordinates, it guarantees convexity of each b-convex function, as they showed with Figalli [18]; see
Proposition 4.3. Hereafter we show, in addition, that it is necessary and sufficient to guarantee convexity
of the set $\mathcal{V}^b_{\bar Y}$ of b-convex functions. A variant on the sufficiency was observed simultaneously
and independently from us in a different context by Sei (Lemma 1 of [57]), who was interested in the
function $b(x,y) = -d^2_{S^n}(x,y)$, and used it to give a convex parametrization of a family of statistical
densities he introduced on the round sphere $X = Y = S^n$.

3. Results concerning the principal-agent problem

A mathematical concept of central relevance to us is encoded in the following definition.

Definition 3.1 (b-convex). A function $u : \bar X \to \mathbf{R}$ is called b-convex if $u = (u^{b^*})^b$, where

(3.1)   $v^b(x) = \sup_{y \in \bar Y} b(x,y) - v(y)$ and $u^{b^*}(y) = \sup_{x \in \bar X} b(x,y) - u(x)$.

In other words, if u is its own second b-transform, i.e. a supremal convolution (or generalized
Legendre transform) of some function $v : \bar Y \to \mathbf{R} \cup \{+\infty\}$ with b. The set of b-convex functions
will be denoted by $\mathcal{V}^b_{\bar Y}$. Similarly, we define the set of b*-convex functions to consist of those
$v : \bar Y \to \mathbf{R}$ satisfying $v = (v^b)^{b^*}$.
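Definition 3.1 can be explored numerically once the type spaces are replaced by finite grids, so that the suprema in (3.1) become maxima. The following is a minimal sketch under that assumption; the grids, the sample price menu, and the helper names `b_transform`, `b_star_transform` and `is_b_convex` are hypothetical choices made for illustration and do not come from the text. It checks that an indirect utility $u = v^b$ is its own second b-transform, while a strictly concave function is not b-convex for the bilinear preference.

```python
def b_transform(v, b, xs, ys):
    """v^b(x) = max over y of b(x, y) - v(y), as in (3.1), on a finite grid."""
    return [max(b(x, y) - vy for y, vy in zip(ys, v)) for x in xs]

def b_star_transform(u, b, xs, ys):
    """u^{b*}(y) = max over x of b(x, y) - u(x), as in (3.1), on a finite grid."""
    return [max(b(x, y) - ux for x, ux in zip(xs, u)) for y in ys]

def is_b_convex(u, b, xs, ys, tol=1e-12):
    """Check u = (u^{b*})^b on the grid, up to floating-point rounding."""
    second = b_transform(b_star_transform(u, b, xs, ys), b, xs, ys)
    return all(abs(s - ui) <= tol for s, ui in zip(second, u))

# Illustrative data (assumed): bilinear preference on grids in [0, 1].
b = lambda x, y: x * y
xs = [i / 10 for i in range(11)]
ys = [j / 10 for j in range(11)]
v = [y ** 2 + 0.1 * ((7 * j) % 3) for j, y in enumerate(ys)]  # arbitrary price menu

u = b_transform(v, b, xs, ys)                 # indirect utility u = v^b
print(is_b_convex(u, b, xs, ys))              # u should equal its second b-transform

concave = [-(x - 0.5) ** 2 for x in xs]       # strictly concave, so not b-convex here
print(is_b_convex(concave, b, xs, ys))
```

On a finite grid the class of b-convex functions depends on the grid itself, so this sketch illustrates the definition only and should not be read as a discretization of Theorem 3.2.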

Although some authors permit b-convex functions to take the value $+\infty$, our hypothesis (B0)
ensures b-convex functions are Lipschitz continuous and thus that the suprema defining their b-transforms
are finitely attained. Our first result is the following.

Theorem 3.2 (b-convex functions form a convex set). Assuming $b : \bar X \times \bar Y \to \mathbf{R}$ satisfies (B0)-(B2),
hypothesis (B3) becomes necessary and sufficient for the convexity of the set $\mathcal{V}^b_{\bar Y}$ of b-convex
functions on $\bar X$.

To understand the relevance of this theorem to economic theory, let us recall a mathematical
formulation of the principal-agent problem based on [13] and [48] [49]. In this context, each product
$y \in \bar Y$ costs the principal c(y) to manufacture, and she is free to market this product to the population
X of agents at any lower semicontinuous price v(y) that she chooses. She is aware that product y
has utility b(x,y) to agent $x \in \bar X$, and that in response to any price menu v(y) she proposes, each
agent will compute his indirect utility

(3.2)   $u(x) = v^b(x) := \max_{y \in \bar Y} b(x,y) - v(y)$,

and will choose to buy a product $y_{b,v}(x)$ which attains the maximum, meaning $u(x) = b(x, y_{b,v}(x)) - v(y_{b,v}(x))$.
However, let us assume that there is a distinguished point $y_\emptyset \in \bar Y$ representing the null
product, which the principal is compelled to offer to agents at zero profit,

(3.3)   $v(y_\emptyset) = c(y_\emptyset)$,

either because both quantities vanish (representing the null transaction), or because, as in [10], there
is a competing supplier or regulator from whom the agents can obtain this product at price $c(y_\emptyset)$. In
other words, $u_\emptyset(x) := b(x,y_\emptyset) - c(y_\emptyset)$ acts as the reservation utility of agent $x \in \bar X$, below which he
will reject the principal's offers and decline to participate, whence $u \ge u_\emptyset$. The map $y_{b,v} : \bar X \to \bar Y$
from agents to products they select will not be continuous except possibly if the price menu v is
b*-convex; when $y_{b,v}(x)$ depends continuously on $x \in \bar X$ we say v is strictly b*-convex.
Knowing b, c and a (Borel) probability measure $\mu$ on $\bar X$, representing the relative frequency
of different types of agents in the population, the principal's problem is to decide which lower
semicontinuous price menu $v : \bar Y \to \mathbf{R} \cup \{+\infty\}$ maximizes her profits, or equivalently, minimizes
her net losses:

(3.4)   $\int_X [c(y_{b,v}(x)) - v(y_{b,v}(x))]\, d\mu(x)$.

Note that, by (3.3), the integrand in (3.4) vanishes for any agent x who elects not to participate (i.e., who
chooses the null product $y_\emptyset$).
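For a population with finitely many agent types and a finite menu, the best response (3.2) and the net loss (3.4) reduce to a maximum and a weighted sum. The sketch below is only an illustration under assumed data: the bilinear preference, the quadratic cost, the three-item menu, the agent masses, and the helper names `best_response` and `net_loss` are all hypothetical. The null product is priced at cost, as (3.3) requires.

```python
def best_response(x, menu, b):
    """Return a (product, price) pair maximizing b(x, y) - v(y), as in (3.2)."""
    return max(menu, key=lambda item: b(x, item[0]) - item[1])

def net_loss(agents, menu, b, c):
    """Discrete analogue of (3.4): sum over agents of mu(x) * [c(y) - v(y)]."""
    total = 0.0
    for x, mass in agents:
        y, price = best_response(x, menu, b)
        total += mass * (c(y) - price)
    return total

# Illustrative data (all assumed): bilinear preference, quadratic cost.
b = lambda x, y: x * y
c = lambda y: 0.5 * y * y
menu = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.55)]     # (product y, price v(y)); null product first
agents = [(0.2, 0.25), (0.6, 0.25), (1.0, 0.5)]  # (agent type x, mass mu(x))

loss = net_loss(agents, menu, b, c)
print("net loss:", loss, "profit:", -loss)
```

In this toy menu the lowest agent type prefers the null product and therefore contributes nothing to the sum, a small instance of the exclusion phenomenon discussed in the introduction.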

For absolutely continuous distributions of agents, or more generally if $\mu$ vanishes on Lipschitz
hypersurfaces, it is known that the principal's losses (3.4) depend on v only through the indirect
utility $u = v^b$, an observation which can be traced back to Mirrlees [42] in one dimension and
Rochet [48] more generally; see also Carlier [13]. This indirect utility $u \ge u_\emptyset$ is b-convex, due to the
well-known identity $((v^b)^{b^*})^b = v^b$; see for instance Exercise 2.35 at page 87 of [63]. Conversely, the
principal can design any b-convex function $u \ge u_\emptyset$ that she wishes simply by choosing price strategy
$v = u^{b^*}$. Thus, as detailed below, the principal's problem can be reformulated as a minimization
problem (4.5) on the set $\mathcal{U}_0 := \{u \in \mathcal{V}^b_{\bar Y} \mid u \ge u_\emptyset\}$. Under hypotheses (B0)-(B3), our Theorem 3.2
shows the set $\mathcal{V}^b_{\bar Y}$ of such utilities u to be convex, in the usual sense. This represents substantial
progress, even though the minimization problem (3.4) still depends nonlinearly on $v = u^{b^*}$. If, in
addition, the principal's cost c(y) is a b*-convex function, then Proposition 4.3 and its corollary show
her minimization problem (3.4) becomes a convex functional of u on $\mathcal{U}_0$, so the principal's problem
reduces to a convex program. Necessary and sufficient conditions for a minimum can in principle
then be expressed using Kuhn-Tucker type conditions, and numerical examples could be solved using
standard algorithms. However we do not do this here: unless $\mu$ is taken to be a finite combination of
Dirac masses, the infinite dimensionality of the convex set $\mathcal{V}^b_{\bar Y}$ leads to functional analytic subtleties
even for the bilinear preference function $b(x,y) = \langle x, y\rangle$, which have only been resolved with partial
success by Rochet and Choné in that case [50] [14]. If the b*-convexity of c(y) is strict however, or if
the preference function is positively cross-curved (B3u), we shall show the principal's program has
enough strict convexity to yield unique optimal strategies for both the principal and the agents in a
sense made precise by Theorem 4.5. These optimal strategies represent a Stackelberg (rather than
a Nash) equilibrium, in the sense that no party has any incentive to change his or her strategies,
given that the principal must commit to and declare her strategy before the agents select theirs.

Of course, it is of practical interest that the principal be able to anticipate not only her optimal
price menu $v : \bar Y \to \mathbf{R} \cup \{+\infty\}$ (also known as the equilibrium prices) but the corresponding
distribution of goods which she will be called on to manufacture. This can be represented as a Borel
probability measure $\nu$ on $\bar Y$, which we call the optimal production measure. It quantifies the relative
frequency of goods to be produced, and is the image of $\mu$ under the agents' best response function
$y_{b,v} : \bar X \to \bar Y$ to the principal's optimal strategy v. This image $\nu = (y_{b,v})_\# \mu$ is a Borel probability
measure on $\bar Y$ known as the push-forward of $\mu$ by $y_{b,v}$, and is defined by the formula

(3.5)   $\nu(W) := \mu[y_{b,v}^{-1}(W)]$

for each $W \subset \bar Y$. Theorem 4.5 asserts the optimal production measure $\nu$ is unique and the optimal
price menu v is uniquely determined $\nu$-a.e.; the same theorem gives a sharp lower bound for v
throughout $\bar Y$. If the convex domain $\bar X_{y_\emptyset}$ is strictly convex and the density of agents is Lipschitz
continuous on X, Theorem 4.7 goes on to assert that these prices will be high enough to drive a
positive fraction of agents out of the market, extending Armstrong's desirability of exclusion [3]
to a rich class of multidimensional models. Thus the goods to be manufactured and their prices
are uniquely determined at equilibrium, and the principal can price the goods she prefers not to
trade arbitrarily high but not arbitrarily low. Theorem 4.5 goes on to assert that the optimal
strategy $y_{b,v}(x)$ is also uniquely determined for $\mu$-almost every agent x by b, c and $\mu$, for each Borel
probability measure $\mu$ on $\bar X$. Apart from Theorem 4.7, these conclusions apply to singular and
discrete measures as well as to continuous measures $\mu$, assuming the tie-breaking conventions of
Remark 4.2 are adopted whenever $\mu$ fails to vanish on each Lipschitz hypersurface.
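When $\mu$ is concentrated on finitely many agent types, the push-forward in (3.5) amounts to adding up the mass of all agents who select the same product. The following minimal sketch makes that concrete; the agent masses, the threshold response rule `response`, and the helper names `push_forward` and `measure_of` are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

def push_forward(agents, response):
    """nu = (response)_# mu for a discrete mu given as (type, mass) pairs."""
    nu = defaultdict(float)
    for x, mass in agents:
        nu[response(x)] += mass
    return dict(nu)

def measure_of(nu, W):
    """nu(W) for a finite set W of products."""
    return sum(mass for y, mass in nu.items() if y in W)

# Illustrative data (assumed): four agent types and a threshold response rule.
agents = [(0.2, 0.25), (0.5, 0.125), (0.6, 0.125), (1.0, 0.5)]
response = lambda x: 0.0 if x < 0.4 else (0.5 if x < 0.8 else 1.0)

nu = push_forward(agents, response)
W = {0.5, 1.0}
preimage_mass = sum(mass for x, mass in agents if response(x) in W)  # mu[response^{-1}(W)]
print(measure_of(nu, W), preimage_mass)
```

Both printed numbers are computed from the same data in two ways, once through $\nu$ and once through the preimage under the response map, which is exactly the content of formula (3.5).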

A number of examples of preference functions b(x,y) which satisfy our hypotheses are developed
in [19] [25] [26] [29] [30] [34] [36] [62]. Here we mention just three: 

Example 3.3. For single dimensional type and allocation spaces n = 1, hypotheses (B1)-(B2)
are equivalent to asserting that the preference function b(x, y) be defined on a product of two
intervals where its cross-partial derivative $D^2_{xy} b$ does not vanish. Positive cross-curvature (B3u) asserts
that $D^2_{xy} b$ in turn satisfies a Spence-Mirrlees condition, by having positive cross-partial derivatives:
$D^2_{xy}(D^2_{xy} b) > 0$.

Example 3.4. The bilinear preference function $b(x,y) = x \cdot y$ of Armstrong, Rochet and Choné
satisfies (B0)-(B3) provided only that $X, Y \subset \mathbf{R}^n$ are convex bodies. In this case b- and b*-convexity
coincide with ordinary convexity. Thus Theorem 4.5 asserts that any strictly convex
manufacturing cost c(y) leads to unique optimal strategies for the principal and for $\mu$-almost every
agent. This uniqueness is well-known for absolutely continuous measures $d\mu \ll d\mathrm{vol}$ [50], and Carlier
and Lachand-Robert have extended Mussa and Rosen's differentiability result $u \in C^1(X)$ to $n \ge 1$ in
that case [15] [46], but the uniqueness of optimal strategies under the tie-breaking rules described in
Remark 4.2 may be new results when applied, for example, to discrete distributions $\mu$ concentrated
on finitely many agent types.

Example 3.5. Ma, Trudinger and Wang's perturbation $b(x,y) = x \cdot y + F(x)G(y)$ of the bilinear
preference function is non-negatively cross-curved (B3) provided $F \in C^4(\bar X)$ and $G \in C^4(\bar Y)$ are
both convex [36] [25]; it is positively cross-curved if the convexity is strong, meaning both $F(x) - \epsilon|x|^2$
and $G(y) - \epsilon|y|^2$ remain convex for some $\epsilon > 0$. It satisfies (B0)-(B1) provided $\sup_{x \in X} |DF(x)| < 1$
and $\sup_{y \in Y} |DG(y)| < 1$, and (B2) if the convex domains X and $Y \subset \mathbf{R}^n$ are sufficiently convex,
meaning all principal curvatures of these domains are sufficiently large at each boundary point [36].
On the other hand, $b(x,y) = x \cdot y + F(x)G(y)$ will violate (B3) if $D^2F(x_0) > 0$ holds but $D^2G(y_0) \ge 0$
fails at some $(x_0, y_0) \in X \times Y$.
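The sign claims of Example 3.5 can be probed numerically in one dimension by evaluating the mixed fourth derivative in (2.1) with finite differences. The sketch below is an illustration under stated assumptions, not a proof: it takes n = 1 and X = Y = ]0,1[, chooses $F(x) = x^2/4$ with either $G(y) = y^2/4$ (convex) or $G(y) = -y^2/4$ (concave), builds the curves x(s) and y(t) by bisection so that $D_y b(x(s), y_0)$ and $D_x b(x_0, y(t))$ are affine, and uses a step h = 0.05. The helper names `bisect`, `make_b` and `cross_curvature` are hypothetical.

```python
def bisect(g, target, lo, hi, iters=100):
    """Solve g(z) = target for an increasing function g on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def make_b(F, dF, G, dG):
    """Return b, D_x b and D_y b for b(x, y) = x*y + F(x)*G(y) in one dimension."""
    b = lambda x, y: x * y + F(x) * G(y)
    b_x = lambda x, y: y + dF(x) * G(y)
    b_y = lambda x, y: x + F(x) * dG(y)
    return b, b_x, b_y

def cross_curvature(b, b_x, b_y, x0, y0, h=0.05, lo=0.0, hi=1.0):
    """Finite-difference estimate of the left-hand side of (2.1) at (x0, y0)."""
    p0, q0 = b_y(x0, y0), b_x(x0, y0)
    # x(s) makes D_y b(x(s), y0) = p0 + s; y(t) makes D_x b(x0, y(t)) = q0 + t.
    xs = [bisect(lambda x: b_y(x, y0), p0 + k * h, lo, hi) for k in (-1, 0, 1)]
    ys = [bisect(lambda y: b_x(x0, y), q0 + k * h, lo, hi) for k in (-1, 0, 1)]
    w = (1.0, -2.0, 1.0)  # second-difference weights in each variable
    total = sum(w[i] * w[j] * b(xs[i], ys[j]) for i in range(3) for j in range(3))
    return total / h ** 4

F, dF = (lambda x: x * x / 4), (lambda x: x / 2)
convex_case = make_b(F, dF, lambda y: y * y / 4, lambda y: y / 2)
concave_case = make_b(F, dF, lambda y: -y * y / 4, lambda y: -y / 2)

print(cross_curvature(*convex_case, x0=0.5, y0=0.5))   # F and G both convex
print(cross_curvature(*concave_case, x0=0.5, y0=0.5))  # G concave
```

With these choices both derivatives stay below 1 in absolute value, so the maps inverted by bisection are increasing on [0,1]. The estimate comes out positive when F and G are both convex and negative when G is concave, in line with the statement of the example.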

In the next section we formulate the results mathematically. Let us first highlight one implication
of our results concerning robustness of the phenomena observed by Rochet and Choné. Their
bilinear function $b(x,y) = x \cdot y$ lies on the boundary of the set of non-negatively cross-curved
preference functions, since its cross-curvature (2.1) vanishes identically. Our results show non-negative
cross-curvature (B3) to be a necessary and sufficient condition for the principal-agent
problem to be a convex program: the feasible set $\mathcal{V}^b_{\bar Y}$ becomes non-convex otherwise, and it is
reasonable to expect that uniqueness of the solution among other phenomena observed in [50] may
be violated in that case. In analogy with the discontinuities discovered by Loeper [33], we therefore
conjecture that the bundling discovered by Rochet and Choné is robust with respect to perturbations
of the bilinear preference function which respect (B0)-(B3), but not generally with respect to
perturbations violating (B3).

4. Mathematical formulation

Any price menu $v : \bar Y \to \mathbf{R} \cup \{+\infty\}$ satisfies

(4.1)   $v^b(x) + v(y) - b(x,y) \ge 0$

for all $(y,x) \in \bar Y \times \bar X$, according to definition (3.1). Comparison with (3.2) makes it clear that a
(product, agent) pair produces equality in (4.1) if and only if selecting product y is among the best
responses of agent x to this menu; the set of such best-response pairs is denoted by $\partial^{b^*} v \subset \bar Y \times \bar X$; see
also (A.2). We think of this relation as giving a multivalued correspondence between products and
agents: given price menu v the set of agents (if any) willing to select product y is denoted by $\partial^{b^*} v(y)$.
It turns out $\partial^{b^*} v(y)$ is non-empty for all $y \in \bar Y$ if and only if v is b*-convex. Thus b*-convexity of
v, or of c, means precisely that each product is priced low enough to be included among the
best responses of some agent or limiting agent type $x \in \bar X$. As we shall see in Remark 4.2, assuming
b*-convexity of v costs little or no generality; however, the b*-convexity of c is a real restriction,
but plausible when the product types $Y \subset \mathbf{R}^n$ represent mixtures (weighted combinations of
pure products) which the principal could alternately choose to purchase separately and then bundle
together; this becomes natural in the context of von Neumann and Morgenstern preference functions
[64] like the one used by Rochet and Choné [50].

Let $\operatorname{Dom} Du \subset \bar X$ denote the set where u is differentiable. If y is among the best responses of
agent $x \in \operatorname{Dom} Dv^b$ to price menu v, the equality in (4.1) implies

(4.2)   $Dv^b(x) = D_x b(x,y)$.

In other words $y = y_b(x, Dv^b(x))$, where $y_b$ is defined as follows:

Definition 4.1. For each $x \in \bar X$ and $q \in \bar Y_x$, let us define $y_b(x,q)$ to be the unique solution to

(4.3)   $D_x b(x, y_b(x,q)) = q$

guaranteed by (B1). The map $y_b$ (which is defined on a subset of the cotangent bundle $T^*\bar X$ and
takes values in $\bar Y$) has also been called the b-exponential map [33], and denoted by $y_b(x,q) = b\text{-}\mathrm{Exp}_x\, q$.
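In one dimension the map $y_b$ of Definition 4.1 can be evaluated by a simple root search, since (B1) makes $y \mapsto D_x b(x,y)$ strictly monotone. The sketch below assumes this map is increasing on the interval [y_lo, y_hi] and that q lies in its range; the helper name `b_exp` and the sample preference $b(x,y) = xy + x^2 y^2/16$ (whose x-derivative is $y + x y^2/8$) are hypothetical choices for illustration.

```python
def b_exp(Dx_b, x, q, y_lo, y_hi, iters=100):
    """Solve D_x b(x, y) = q for y, as in (4.3), assuming y -> D_x b(x, y) is increasing."""
    lo, hi = y_lo, y_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if Dx_b(x, mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Bilinear case b(x, y) = x * y: D_x b(x, y) = y, so y_b(x, q) = q.
bilinear_Dx = lambda x, y: y
print(b_exp(bilinear_Dx, 0.3, 0.7, 0.0, 1.0))

# Perturbed case (assumed example): b(x, y) = x*y + (x**2 / 4) * (y**2 / 4).
perturbed_Dx = lambda x, y: y + x * y * y / 8
y_sel = b_exp(perturbed_Dx, 0.5, 0.4, 0.0, 1.0)
print(y_sel, perturbed_Dx(0.5, y_sel))   # the second value should reproduce q = 0.4
```

Bisection is used here only because it needs nothing beyond monotonicity; in higher dimensions (4.3) is a nonlinear system and a different solver would be required.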

The fact that the best response function takes the form $y = y_b(x, Dv^b(x))$, and that $\operatorname{Dom} Dv^b$
exhausts $\bar X$ except for a countable number of Lipschitz hypersurfaces, are among the key observations
exploited in [20] [31] following special cases worked out in [3] [8] [11] [21] [40] [50]. Indeed, $v^b$ is
well-known to be a b-convex function. It is therefore Lipschitz and semiconvex, satisfying the bounds

(4.4)   $|Dv^b| \le \|b\|_{C^1(X \times Y)}, \qquad D^2 v^b \ge -\|b\|_{C^2(X \times Y)}$ inside X.

The second inequality above holds in the distributional sense, and implies the differentiability of $v^b$
outside a countable number of Lipschitz hypersurfaces.

Assuming $\mu$ assigns zero mass to each Lipschitz hypersurface (and so also to a countable number
of them), the results just summarized allow the principal's problem (3.4) to be re-expressed in the
form $\min\{L(u) \mid u \in \mathcal{U}_0\}$, where the principal's net losses are given by

(4.5)   $L(u) := \int_X [u(x) + c(y_b(x, Du(x))) - b(x, y_b(x, Du(x)))]\, d\mu(x)$

as is by now well-known [13]. Here $\mathcal{U}_0 = \{u \in \mathcal{V}^b_{\bar Y} \mid u \ge u_\emptyset\}$ denotes the set of b-convex functions
on $\bar X$ dominating the reservation utility $u_\emptyset(x) = b(x, y_\emptyset) - c(y_\emptyset)$, and the equality produced in (4.1)
by the response $y_{b,v}(x) = y_b(x, Dv^b(x))$ for $\mu$-a.e. x has been exploited. Our hypothesis on the
distribution of agent types holds a fortiori whenever $\mu$ is absolutely continuous with respect to
Lebesgue measure in coordinates on X. If no such hypothesis is satisfied, the reformulation (4.5) of
the principal's net losses may not be well-defined, unless we extend the definition of Du(x) to all of
$\bar X$ by making a measurable selection from the relation

$\partial u(x) := \{q \in \mathbf{R}^n \mid u(z) \ge u(x) + q \cdot (z - x) + o(|z - x|) \ \ \forall z \in X\}$

consistent with the following tie-breaking rule, analogous to one adopted, e.g., by Buttazzo and
Carlier in a different but related context [10]:

Remark 4.2 (Tie-breaking rules for singular measures). When an agent x remains indifferent between
two or more products, it is convenient to reduce the ambiguity in the definition of his best response
by insisting that $y_{b,v}(x)$ be chosen to maximize the principal's profit $v(y) - c(y)$, among those
products y which maximize (3.2). We retain the result $y_{b,v}(x) = y_b(x, Dv^b(x))$ by a corresponding
selection $Dv^b(x) \in \partial v^b(x)$. This convention costs no generality when the distribution $\mu$ of agent
types vanishes on Lipschitz hypersurfaces in X, since $u = v^b$ is then differentiable $\mu$-a.e.; in the
remaining cases it may be justified by assuming the principal has sufficient powers of persuasion to
sway an agent's choice to her own advantage whenever some indifference would otherwise persist
between his preferred products [42]. After adopting this convention, it costs the principal none of her
profits to restrict her choice of strategies to b*-convex price menus $v = (v^b)^{b^*}$, a second convention
we also choose to adopt whenever $\mu$ fails to vanish on each Lipschitz hypersurface.
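For a finite menu, the tie-breaking convention of Remark 4.2 can be written as a two-stage selection: first collect the products that maximize the agent's utility (3.2), then pick among them one that maximizes the principal's profit $v(y) - c(y)$. The sketch below is illustrative only; the menu, the preference, the cost, the tolerance `tol` used to detect ties in floating point, and the helper name `best_response_with_ties` are assumptions.

```python
def best_response_with_ties(x, menu, b, c, tol=1e-12):
    """Among products maximizing b(x, y) - v(y), return one maximizing v(y) - c(y)."""
    utilities = [b(x, y) - price for y, price in menu]
    top = max(utilities)
    # Products the agent is indifferent between, up to rounding.
    candidates = [item for item, u in zip(menu, utilities) if top - u <= tol]
    return max(candidates, key=lambda item: item[1] - c(item[0]))

# Illustrative data (assumed): agent x = 1 is indifferent between two products.
b = lambda x, y: x * y
c = lambda y: 0.5 * y * y
menu = [(0.0, 0.0), (0.5, 0.25), (1.0, 0.75)]   # (product y, price v(y))

choice = best_response_with_ties(1.0, menu, b, c)
print(choice)   # the tie is resolved in favour of the more profitable product
```

Agent x = 1 obtains utility 0.25 from both non-null products, while the principal earns 0.125 on the first and 0.25 on the second, so the convention selects the second.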

The relevance of Theorem 3.2 to the principal-agent problem should now be clear: it guarantees
convexity of the feasible set $\mathcal{U}_0$ in (4.5). Our next proposition addresses the convexity properties
of the principal's objective functional. Should convexity of this objective be strict, then the best
response $y_{b,v}(x)$ selected by the tie-breaking rule above becomes unique, which it need not be
otherwise.

Proposition 4.3 (Convexity of the principal's objective). If $b \in C^4(\bar X \times \bar Y)$ satisfies (B0)-(B3)
and $c : \bar Y \to \mathbf{R}$ is b*-convex, then for each $x \in \bar X$, definition (4.3) makes $a(q) := c(y_b(x,q)) - b(x, y_b(x,q))$
a convex function of q on the convex set $\bar Y_x := D_x b(x, \bar Y) \subset \mathbf{R}^n$. The convexity of
a(q) is strict if either c is strictly b*-convex (meaning the efficient allocation $y_{b,c} : \bar X \to \bar Y$ is
continuous) or alternately if

(4.6)   $q \in \bar Y_x \mapsto b(x_0, y_b(x,q)) - b(x, y_b(x,q))$

is a strictly convex function of q for each $x, x_0 \in \bar X$. If the preference function b is positively cross-curved
on $\bar X \times \bar Y$, then convexity of a(q) is strong (meaning $a(q) - \epsilon|q|^2/2$ remains convex on $\bar Y_x$
for some $\epsilon > 0$).

Strict convexity of (4.6) may subsequently be denoted by (B3s). As an immediate corollary to 
Theorem 3.2 and Proposition 4.3, we have convexity of the principal's optimization problem. 

Corollary 4.4 (Convexity of the principal's minimization). Let the distribution of agent types be
given by a Borel probability measure $\mu$ on $X \subset \mathbf{R}^n$. Unless $\mu$ vanishes on all Lipschitz hypersurfaces,
adopt the tie-breaking conventions of Remark 4.2. If the preference b(x, y) of agent $x \in \bar X$ for product
$y \in \bar Y$ satisfies (B0)-(B3) and the principal's manufacturing cost $c : \bar Y \to \mathbf{R}$ is b*-convex, then
the principal's problem (4.5) becomes a convex minimization over the convex set $\mathcal{U}_0$.

As a consequence, we obtain criteria guaranteeing uniqueness of the principal's best strategy. 

Theorem 4.5 (Criteria for uniqueness of optimal strategies). Assume the notation and hypotheses
of Corollary 4.4. Suppose, in addition, that the manufacturing cost c is strictly b*-convex, or that the
preference function b is positively cross-curved (B3u), or that b satisfies (B3s), as in (4.6). Then
the equilibrium response of $\mu$-almost every agent is uniquely determined, as is the optimal measure $\nu$
from (3.5) (always assuming the tie-breaking conventions of Remark 4.2 to be in effect if $\mu$ does not
vanish on each Lipschitz hypersurface). Moreover, the principal has two optimal strategies $u_\pm \in \mathcal{U}_0$
which coincide at least $\mu$-almost everywhere, and sandwich all other optimal strategies $u \in \mathcal{U}_0$ between
them: $u_- \le u \le u_+$ on $\bar X$. Finally, a lower semicontinuous $v : \bar Y \to \mathbf{R} \cup \{+\infty\}$ is an optimal
price menu if and only if $v \ge u_+^{b^*}$ throughout $\bar Y$, with equality holding $\nu$-almost everywhere.

This theorem gives hypotheses which guarantee, even for discrete measures $\mu$ corresponding to
finitely many agent types, that the solution to the principal's problem is unique in the sense that
optimality determines how many of each type of product the principal should manufacture, what
price she should charge for each of them, and which product will be selected by almost every agent.
A lower bound is specified on the price of each product which she does not wish to produce, to
ensure that it does not tempt any agent. When $\mu$ vanishes on Lipschitz hypersurfaces, this solution
represents the only Stackelberg equilibrium balancing the interests of the principal with those of
the agents; for more singular $\mu$, it is possible that other Stackelberg equilibria exist, but if so they
violate the restrictions imposed on the behaviour of the principal and the agents in Remark 4.2.

The uniqueness theorem has as its corollary the following stability result concerning optimal
strategies. Recall that a sequence $\{\mu_i\}_{i=1}^\infty$ of Borel probability measures on a compact set
$X \subset \mathbb{R}^n$ is said to converge weakly-* to $\mu_\infty$ if

(4.7)    $\int_X g(x)\,d\mu_\infty(x) = \lim_{i \to \infty} \int_X g(x)\,d\mu_i(x)$

for each continuous test function $g : X \to \mathbb{R}$. This notion of convergence makes the Borel
probability measures $\mathcal{P}(X)$ on $X$ into a compact set, as a consequence of the Riesz-Markov and
Banach-Alaoglu theorems.
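
As a minimal numerical sketch of definition (4.7), and not part of the original argument, one can take $X = [0,1]$ and let $\mu_i$ consist of $i$ equally weighted atoms at the midpoints of a uniform partition; these discrete measures converge weakly-* to Lebesgue measure. The helper names `integrate_discrete` and `uniform_atoms`, and the choice of test function, are assumptions made purely for illustration.

```python
import math

def integrate_discrete(g, atoms, weights):
    """Integrate g against the discrete measure sum_k weights[k] * delta_{atoms[k]}."""
    return sum(w * g(x) for x, w in zip(atoms, weights))

def uniform_atoms(i):
    """Hypothetical mu_i: i equally weighted agent types at midpoints of [0, 1]."""
    atoms = [(k + 0.5) / i for k in range(i)]
    weights = [1.0 / i] * i
    return atoms, weights

g = math.cos            # one continuous test function on X = [0, 1]
limit = math.sin(1.0)   # integral of cos against Lebesgue measure on [0, 1]

# Gap between the two sides of (4.7) for increasingly fine discrete measures.
errors = [abs(integrate_discrete(g, *uniform_atoms(i)) - limit) for i in (10, 100, 1000)]
```

The gaps in `errors` shrink as $i$ grows, which is what (4.7) demands for this particular $g$; weak-* convergence requires the same for every continuous test function, so a single $g$ only illustrates the definition.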

Corollary 4.6 (Stability of optimal strategies). For each $i \in \mathbb{N} \cup \{\infty\}$, let the triple
$(b_i, c_i, \mu_i)$ consist of a preference function $b_i : X \times Y \to \mathbb{R}$, manufacturing cost
$c_i : Y \to \mathbb{R}$, and a distribution of agent types $\mu_i$ on $X$ satisfying the hypotheses of
Theorem 4.5. Let $u_i : X \to \mathbb{R}$ denote a $b_i$-convex utility function minimizing the losses of a
principal faced with data $(b_i, c_i, \mu_i)$. Suppose that $b_i \to b_\infty$ in $C^2(X \times Y)$,
$c_i \to c_\infty$ uniformly on $Y$, and $\mu_i \rightharpoonup \mu_\infty$ weakly-* as $i \to \infty$. Assume
finally that $\mu_\infty$ vanishes on all Lipschitz hypersurfaces. For $\mu_\infty$-a.e. agent $x \in X$, the product
$G_i(x) := y_{b_i}(x, Du_i(x))$ selected then converges to $G_\infty(x)$. The optimal measures
$\nu_i := (G_i)_\# \mu_i$ converge weakly-* to $\nu_\infty$ as $i \to \infty$. And the principal's strategies
converge uniformly in the sense that $\lim_{i \to \infty} \|u_i - u_\infty\|_{L^\infty(X, d\mu_\infty)} = 0$.

Finally, as evidence for the robustness of bunching phenomena displayed by our models, we show
that the desirability of exclusion phenomenon found by Armstrong for preference functions
$b(x, y) = \sum_{i=1}^n x_i b_i(y)$ which are linear in $x$, or more generally homogeneous [3], extends to
the full range of non-negatively cross-curved models. We assume strict convexity on the domain
$X_{y_\emptyset} := D_y b(X, y_\emptyset)$ (see Remark 4.8), and that the distribution of agent types
$d\mu(x) = f(x)dx$ has a Sobolev density, denoted $f \in W^{1,1}(X)$ and meaning both the function and its
distributional derivative $Df$ are given by Lebesgue integrable densities. This is satisfied a fortiori
if $f$ is Lipschitz or continuously differentiable (as Armstrong assumed). The exclusion phenomenon is
of interest, since it confirms that a positive fraction of customers must be excluded from
participation at equilibrium, thus ensuring elasticity of demand.

Theorem 4.7 (The desirability of exclusion). Let the distribution $d\mu(x) = f(x)dx$ of agent types
be given by a density $f \in W^{1,1}$ on $X \subset \mathbb{R}^n$. Assume that the preference $b(x, y)$ of agent
$x \in X$ for product $y \in Y$ satisfies (B0)-(B3) and the principal's manufacturing cost
$c : Y \to \mathbb{R}$ is $b^*$-convex. Suppose further that the convex domain
$X_{y_\emptyset} = D_y b(X, y_\emptyset)$ has no $n-1$ dimensional facets in its boundary. Then any minimizer
$u \in \mathcal{U}_\emptyset$ of the principal's losses (4.5) coincides with the reservation utility on a set
$U_\emptyset := \{x \in X \mid u(x) = b(x, y_\emptyset) - c(y_\emptyset)\}$ whose interior contains a positive
fraction of the agents. Such agents select the null product $y_\emptyset$.

Remark 4.8 (Facets and exclusion in different dimensions). A convex domain $X \subset \mathbb{R}^n$ fails to
be strictly convex if it has line segments in its boundary. These segments belong to facets of
dimension 1 or higher, up to $n-1$ if the domain has a flat side (meaning a positive fraction of its
boundary coincides with a supporting hyperplane). Thus strict convexity of $X_{y_\emptyset}$ is sufficient for
the hypothesis of the preceding theorem to be satisfied, except in dimension $n = 1$. In a single
dimension, every convex domain $X \subset \mathbb{R}$ is an interval, hence strictly convex, whose endpoints
form zero-dimensional facets. Thus Theorem 4.7 is vacuous in dimension $n = 1$, which is consistent
with Armstrong's observation that the necessity of exclusion is a hallmark of higher dimensions $n \ge 2$.
More recently, Deneckere and Severinov [16] have argued that necessity of exclusion is specific to the
case in which the dimensions $m$ and $n$ of agent and product types coincide. When $(m, n) = (2, 1)$
they give necessary and sufficient conditions for the desirability of exclusion, yielding a result quite
different from ours in that exclusion turns out to be more frequently the exception than the rule.

5. Discussion, extension, and conclusions 

The role of private information in determining market value has a privileged place in economic
theory, acknowledged by the award of the Nobel Memorial Prize in Economic Sciences to Mirrlees
and Vickrey in 1996, and to Akerlof, Spence and Stiglitz in 2001. This phenomenon has been deeply
explored in the principal-agent framework, where a single seller (or single buyer) transacts business
with a collection of anonymous agents. In this context, the private (asymmetric) information takes
the form of a characteristic $x \in X$ peculiar to each individual buyer which determines his preference
$b(x, y)$ for different products $y \in Y$ offered by the principal; $x$ remains concealed from the principal
by anonymity of the buyer, at least until a purchase is made. Knowing only the preference function
$b(x, y)$, the statistical distribution $d\mu(x)$ of buyer types, and her own manufacturing costs $c(y)$, the
principal's goal is to fix a price menu for different products which maximizes her profits.

Many studies involving finite spaces of agent and product types $X$ and $Y$ have been carried out,
including Spence's initial work on labour market signalling [58] and Stiglitz's paper with Rothschild
on insurance [54]. However for a principal who transacts business with a one-dimensional continuum
of agents $X \subset \mathbb{R}$, the problem was solved in Mirrlees' celebrated work on optimal taxation [42], and
in Spence's study [59], assuming the contract types $y \in Y \subset \mathbb{R}$ are also parameterized by a single
real variable. (For Mirrlees, $y \in \mathbb{R}$ represented the amount of labour an individual chooses to do
facing a given tax schedule, while for Spence it represented the amount of education he chooses to
acquire facing a given range of employment possibilities, $x \in \mathbb{R}$ being his intrinsic ability in both
cases.) In the context of nonlinear pricing discussed above, the one-dimensional model was studied
by Mussa and Rosen [46]. The challenge of resolving the multidimensional version $X, Y \subset \mathbb{R}^n$ of this
archetypal problem in microeconomic theory has been highlighted by many authors [39] [65] [44] [51]
[7]. When only one side of the market displays multidimensional types, analyses have been carried out
by Mirman and Sibley [41], Roberts [47] and Spence [60], who allow multidimensional products, and
by Laffont, Maskin and Rochet [27], Araujo, Gottlieb and Moreira [2], and Deneckere and Severinov
[16], who model two-dimensional agents choosing from a one-dimensional product line. When both
sides of the market display multidimensional types, existence of an equilibrium has been established
by Monteiro and Page [45] and by Carlier [13], who employed a variational formulation; see also the
control-theoretic approach of Basov [6] [7]. However, non-convexities have rendered the behaviour of
this optimization problem largely intractable [24], unless the preference function $b(x, y) = x \cdot G(y)$ is
assumed to depend linearly on agent type [66] [3] [50]. Moreover, the presence of convexity typically
depends on a correct choice of coordinates, so is not always easy to discern. The present study treats
general Borel probability measures $\mu$ on $X \subset \mathbb{R}^n$, and provides a unified framework for dealing with
discrete and continuous type spaces, by invoking the tie-breaking rules of Remark 4.2 in case $\mu$ is
discrete. Assuming $b^*$-convexity of $c$, we consider preferences linear in price (3.2) (sometimes called
quasilinear), which satisfy a generalized Spence-Mirrlees single crossing condition (B0)-(B1) and
appropriate convexity conditions on its domain (B2), and we identify a criterion (B3) equivalent to
convexity of the principal's optimization problem (Theorem 3.2). This criterion is a strengthening of
Ma, Trudinger and Wang's necessary [33] and sufficient [36] [62] condition for continuity of optimal
mappings. Like all of our hypotheses, it is independent of the choice of parameterization of agent
and/or product types, as emphasized in [25]. We believe the resulting convexity is a fundamental
property which will eventually enable a more complete theoretical and computational analysis of the
multidimensional principal-agent problem, and we indicate some examples of preference functions
which satisfy it in Examples 3.3-3.5; the bilinear example $b(x, y) = x \cdot y$ of Rochet and Chone lies on
the boundary of such preference functions. If either the cross-curvature inequality (B3) holds strictly
or the $b^*$-convexity of $c(y)$ is strict (meaning the efficient solution $y_{b,c}(x)$ depends continuously
on $x \in X$), we go on to derive uniqueness and stability of optimal strategies (Theorem 4.5 and its
corollary). Under mild additional hypotheses we confirm that a positive fraction of agents must be
priced out of the market when the type spaces are multidimensional (Theorem 4.7). We conjecture
that non-negative cross-curvature (B3) is likely to be necessary and sufficient for robustness of
Armstrong's desirability of exclusion [3] and the other bunching phenomena observed by Rochet
and Chone [50].

Remark 5.1 (Maximizing social welfare under profitability constraints). Before concluding this
paper, let us briefly mention an important class of related models to which the same considerations
apply: namely, the problem of maximizing the expected welfare of the agents under a profitability
constraint on the principal. Such a model has been used by Roberts [47] to study energy pricing by
a public utility, and explored by Spence [60] and Monteiro and Page [45] in other contexts. Suppose
the welfare of agent $x \in X$ is given by a concave function $w(u(x))$ of his indirect utility (3.2).
Introducing a Lagrange multiplier $\lambda$ for the profitability constraint $L(u) \le 0$, the problem of maximizing
the net social welfare over all agents becomes equivalent to the maximization

$W(\lambda) := \max_{u \in \mathcal{U}_\emptyset} \; -\lambda L(u) + \int_X w(u(x))\,d\mu(x)$

for some choice of $\lambda \ge 0$. Assuming (B0)-(B3), and $b^*$-convexity of $c$, for each $\lambda \ge 0$ this amounts
to a concave maximization on a convex set, as a consequence of Theorem 3.2, Proposition 4.3 and
the concavity of $w$. Theorem 4.5 and its corollary give hypotheses which guarantee uniqueness and
stability of its solution $u_\lambda$; if the concavity of $w$ is strict, we obtain uniqueness $\mu$-a.e. of $u_\lambda$ more
directly under the weaker hypotheses of Corollary 4.4. Either way, once the uniqueness of $u_\lambda$ has
been established, standard arguments in the calculus of variations show the convex function $W(\lambda)$
to be continuously differentiable, and that each value of its derivative $W'(\lambda) = -L(u_\lambda)$ corresponds
to a possibly degenerate interval $\lambda \in [\lambda_1, \lambda_2]$ on which $u_\lambda$ is constant; see e.g. Corollary 2.11 of [12].
Uniqueness of a social welfare maximizing strategy subject to any budget constraint in the range
$]L(u_\infty), L(u_0)[$ is therefore established; this range contains the vanishing budget constraint as long
as $L(u_0) > 0 > L(u_\infty)$; here $u_0$ represents the unconstrained maximizer whereas
$u_\infty \in \mathcal{U}_\emptyset$ minimizes the principal's losses (4.5). All of our results, except for the desirability
of exclusion (Theorem 4.7), extend immediately to this new setting. This sole exception is in accord
with the intuition that it need not be necessary to exclude any potential buyers if one aims to
maximize social welfare instead of the monopolist's profits.

6. Proofs 

Let us recall a characterization of non-negative cross-curvature from Theorem 2.11 of [26] , inspired 
by Loeper's characterization [33] of (A3w). We recall its proof partly for the sake of completeness, 
but also to extract a criterion for strong convexity. 

Lemma 6.1 (Characterizing non-negative cross-curvature [26]). A preference function $b$ satisfying
(B0)-(B2) is non-negatively cross-curved (B3) if and only if for each $x, x_1 \in X$

(6.1)    $q \in Y_x \longmapsto b(x_1, y_b(x, q)) - b(x, y_b(x, q))$

is a convex function. If the preference function is positively cross-curved, then (6.1) will be strongly
convex (meaning its Hessian will be positive definite).

Proof. Fix $x, x_1 \in X$ and set $q_t := (1 - t)q_0 + tq_1$ and
$f(\cdot, t) := b(\cdot, y_b(x, q_t)) - b(x, y_b(x, q_t))$ for $t \in [0, 1]$. Given $t_0 \in [0, 1]$, write
$y_{t_0} := y_b(x, q_{t_0})$ and use (B1)-(B2) to define the curve $s \in [0, 1] \longmapsto x_s \in X$ for which

(6.2)    $D_y b(x_s, y_{t_0}) = (1 - s) D_y b(x, y_{t_0}) + s D_y b(x_1, y_{t_0})$,

and set $g(s) = \frac{\partial^2 f}{\partial t^2}(x_s, t_0)$. The convexity of (6.1) will be verified by checking
$g(1) \ge 0$. Let us start by observing $s \in [0, 1] \longmapsto g(s)$ is a convex function, as a consequence of
property (B3) and (6.2) (according to Lemma 4.5 of [25], inequality (2.1) follows from (B3) whenever
either of the two curves $s \in [0, 1] \longmapsto D_y b(x(s), y(0))$ or $s \in [0, 1] \longmapsto D_x b(x(0), y(s))$ is a
line segment). We next claim $g(s)$ is minimized at $s = 0$, since

$g'(s) = \left\langle \frac{\partial^2}{\partial t^2}\Big|_{t = t_0} D_x b\big(x_s, y_b(x, (1 - t)q_0 + tq_1)\big), \dot{x}_s \right\rangle$

vanishes at $s = 0$, by the definition (4.3) of $y_b$. Thus $g(1) \ge g(0) = 0$, establishing the convexity of
(6.1). If $b$ is positively cross-curved, then $g''(s) > 0$ and the desired strong convexity follows from
$g(1) > g(0) = 0$.

Conversely, if the convexity of (6.1) fails we can find $x, x_1 \in X$ and $s_0, t_0 \in [0, 1]$ for which the
construction above yields $g''(s_0) < 0$. In view of Lemma 4.5 of [25], this provides a contradiction to
(2.1). □
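
A worked instance may help fix ideas; it is an illustrative sketch rather than part of the original proof. It assumes, as the use of (4.3) above suggests, that $y_b(x, q)$ denotes the product $y$ solving $D_x b(x, y) = q$, and it takes the bilinear preference $b(x, y) = x \cdot y$ of Rochet and Chone.

```latex
% Bilinear preference: b(x,y) = x . y on X x Y, a subset of R^n x R^n.
% Since D_x b(x,y) = y, the equation D_x b(x, y_b(x,q)) = q gives
\[
  y_b(x, q) = q , \qquad Y_x = D_x b(x, Y) = Y .
\]
% The function (6.1) therefore reduces to
\[
  q \in Y_x \longmapsto b(x_1, q) - b(x, q) = (x_1 - x) \cdot q ,
\]
% which is affine in q: its Hessian vanishes identically, so (6.1) is convex
% but not strongly convex.
```

Under these assumptions the bilinear preference passes the convexity test of Lemma 6.1 with equality, so it is non-negatively but not positively cross-curved, in line with the remark in Section 5 that this example lies on the boundary of the admissible preference functions.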

We shall also need to recall two basic facts about $b$-convex functions from e.g. [21]: any supremum
of $b$-convex functions is again $b$-convex, unless it is identically infinite; and for each $y \in Y$ and
$\lambda \in \mathbb{R}$, the function

(6.3)    $x \in X \longmapsto b(x, y) - \lambda$

is $b$-convex. Functions of the form either $y \in Y \longmapsto b(x, y) - \lambda$ or (6.3) are sometimes called
mountains below.

Proof of Proposition 4.3. The $b^*$-convexity of the manufacturing cost $c = (c^b)^{b^*}$ asserts

$c(y) = \sup_{x_0 \in X} b(x_0, y) - c^b(x_0)$

is a supremum of mountains, whence

$a(q) := c(y_b(x, q)) - b(x, y_b(x, q)) = \sup_{x_0 \in X} b(x_0, y_b(x, q)) - b(x, y_b(x, q)) - c^b(x_0)$

for all $x \in X$ and $q \in Y_x$. According to Lemma 6.1, we have just expressed $a(q)$ as a supremum
of convex functions, thus establishing convexity of $a(q)$. The functions under the supremum are
strictly convex if (4.6) holds, and strongly convex if $b$ is positively cross-curved, thus establishing
the strict or strong convexity of $a(q)$ under the respective hypotheses (B3s) and (B3u).

The remainder of the proof will be devoted to deducing strict convexity of $a(q)$ from strict
$b^*$-convexity of $c(y)$ assuming only (B3). Recall that strict $b^*$-convexity was defined by continuity of the
agents' responses $y_{b,c} : X \to Y$ to the principal's manufacturing costs (as opposed to the prices the
principal would prefer to select). Fix $x \in X$ and use the $C^3$ change of variables
$q \in Y_x \longmapsto y_b(x, q) \in Y$ to define $\tilde{b}(\cdot, q) := b(\cdot, y_b(x, q)) - b(x, y_b(x, q))$ and
$\tilde{c}(q) := c(y_b(x, q)) - b(x, y_b(x, q)) = a(q)$. As in [18], it is easy to deduce that $\tilde{b}$ satisfies
the same hypotheses (B0)-(B3) on $X \times Y_x$ as the original preference function, except for the fact that
$\tilde{b} \in C^3$ whereas $b \in C^4$. For the reasons explained in [18] this discrepancy shall not trouble us
here: we still have continuous fourth derivatives of $\tilde{b}$ as long as at least one of the four derivatives
is with respect to a variable in $X$, and at most three derivatives are with respect to variables in $Y_x$.
Note also that $\tilde{c}^{\tilde{b}} = c^b$ and the continuity of the agents' responses $y_{\tilde{b},\tilde{c}}$ in the new
variables follows from their presumed continuity in the original variables, since
$y_{\tilde{b},\tilde{c}}(\cdot) = D_x b(x, y_{b,c}(\cdot))$.

The advantage of the new variables is that for each $x_0 \in X$, the mountain
$q \in Y_x \longmapsto \tilde{b}(x_0, q)$ is a convex function, according to Lemma 6.1 (alternately, Theorem 4.3
of [18]). To produce a contradiction, assume convexity of $\tilde{c}(q)$ fails to be strict, so there is a segment
$t \in [0, 1] \longmapsto q_t \in Y_x$ given by $q_t = (1 - t)q_0 + tq_1$ along which $\tilde{c}$ is affine with the same
slope $p \in \partial\tilde{c}(q_t)$ for each $t \in [0, 1]$. In fact, the compact convex set $\partial\tilde{c}(q_t)$ is independent
of $t \in \,]0, 1[$, so taking $p$ to be an extreme point of $\partial\tilde{c}(q_t)$ allows us to find a sequence
$q_{t,k} \in Y_x \cap \mathrm{Dom}\, D\tilde{c}$ converging to $q_t$ such that $p = \lim_{k \to \infty} D\tilde{c}(q_{t,k})$, by
Theorem 25.6 of Rockafellar [53]. On the other hand, $\tilde{b}^*$-convexity implies $\tilde{c}(q)$ is a supremum of
mountains: thus to each $t \in [0, 1]$ and integer $k$ corresponds some $x_{t,k} \in X$ such that
$(x_{t,k}, q_{t,k}) \in \partial^{\tilde{b}^*}\tilde{c}$, meaning

(6.4)    $\tilde{c}(q) \ge \tilde{b}(x_{t,k}, q) - \tilde{b}(x_{t,k}, q_{t,k}) + \tilde{c}(q_{t,k})$

for all $q \in Y_x$. Since $q_{t,k} \in \mathrm{Dom}\, D\tilde{c}$, saturation of this bound at $q_{t,k}$ implies
$D\tilde{c}(q_{t,k}) = D_q \tilde{b}(x_{t,k}, q_{t,k})$. Compactness of $X$ allows us to extract a subsequential limit
$(x_{t,k}, q_{t,k}) \to (x_t, q_t) \in \partial^{\tilde{b}^*}\tilde{c}$ satisfying $p = D_q \tilde{b}(x_t, q_t)$. This first order
condition shows the curve $t \in [0, 1] \longmapsto x_t \in X$ to be differentiable, with derivative

(6.5)    $\dot{x}_t = -D^2_{qx}\tilde{b}(x_t, q_t)^{-1} D^2_{qq}\tilde{b}(x_t, q_t)\,\dot{q}_t$,

by the implicit function theorem and (B1). On the other hand, both $\tilde{c}(\cdot)$ and $\tilde{b}(x_t, \cdot)$ are convex
functions of $q \in Y_x$ in (6.4), so both must be affine along the segment $q_t$. This implies
$\dot{q}_t = q_1 - q_0$ is a zero eigenvector of $D^2_{qq}\tilde{b}(x_t, q_t)$, which in turn implies
$x_t = \mathrm{const}$ from (6.5). On the other hand, the efficient response $q_t = y_{\tilde{b},\tilde{c}}(x_t)$ of agent
$x_t$ to price menu $\tilde{c}$ is not constant, since the endpoints $q_0 \ne q_1$ of the segment are distinct.
This produces the desired contradiction and establishes strict convexity of $\tilde{c}$. □

Combining Proposition 4.3 with the following standard lemma will allow us to establish our
necessary and sufficient criteria for convexity of the feasible set $\mathcal{U}_\emptyset$.

Lemma 6.2 (Identification of supporting mountains). Let $u$ be a $b$-convex function on $X$. Assume
$u$ is differentiable at $x_0 \in X$ and $D_x u(x_0) = D_x b(x_0, y)$ for some $y \in Y$. Then $u(x) \ge m(x)$ for all
$x \in X$, where $m(\cdot) = b(\cdot, y) - b(x_0, y) + u(x_0)$.

Proof. By $b$-convexity of $u$, there exists $y_0 \in Y$ such that $u(x_0) = b(x_0, y_0) - u^{b^*}(y_0)$ and also
$u(x) \ge b(x, y_0) - u^{b^*}(y_0)$ for all $x \in X$. Since $u$ is differentiable at $x_0$, this implies
$D_x u(x_0) = D_x b(x_0, y_0)$. By the assumption (B1), we conclude $y = y_0$. This completes the proof since
$m(\cdot) = b(\cdot, y_0) - u^{b^*}(y_0)$. □

Proof of Theorem 3.2. Let us first show the sufficiency. It is enough to show that for any two
$b$-convex functions $u_0$ and $u_1$, the linear combination $u_t := (1 - t)u_0 + tu_1$ is again $b$-convex, for each
$0 \le t \le 1$. Fix $x_0 \in X$. Since $b$-convex functions are defined as suprema of mountains, there exist
$y_0, y_1 \in Y$ such that

$m_i^{x_0}(\cdot) := b(\cdot, y_i) - b(x_0, y_i)$,    $i = 0, 1$,

satisfy $u_i(x) \ge m_i^{x_0}(x) + u_i(x_0)$ for all $x \in X$. Clearly equality holds when $x = x_0$. Let us consider
the function

$m_t^{x_0}(\cdot) = b(\cdot, y_t) - b(x_0, y_t)$,

where $y_t$ defines a line segment

$t \in [0, 1] \longmapsto D_x b(x_0, y_t) = (1 - t) D_x b(x_0, y_0) + t D_x b(x_0, y_1) \in \mathbb{R}^n$.

Note that (i) $m_t^{x_0}(x_0) = 0$. We claim that (ii) $u_t(\cdot) \ge m_t^{x_0}(\cdot) + u_t(x_0)$. Notice that

$u_t(\cdot) \ge (1 - t) m_0^{x_0}(\cdot) + t m_1^{x_0}(\cdot) + u_t(x_0)$.

Thus the claim follows from the inequality $(1 - t) m_0^{x_0} + t m_1^{x_0} \ge m_t^{x_0}$, which is implied by (B3)
according to Lemma 6.1. The last two properties (i) and (ii) enable one to express $u_t$ as a supremum
of mountains

$u_t(\cdot) = \sup_{x_0 \in X} m_t^{x_0}(\cdot) + u_t(x_0)$,

hence $u_t$ is $b$-convex by the remark immediately preceding (6.3).

Conversely, let us show the necessity of (B3) for convexity of $\mathcal{V}_Y$. Using the same notation
as above, recall that each mountain $m_i^{x_0}$, $i = 0, 1$, is $b$-convex. Assume the linear combination
$h_t := (1 - t) m_0^{x_0} + t m_1^{x_0}$ is $b$-convex. Since
$D_x h_t(x_0) = (1 - t) D_x b(x_0, y_0) + t D_x b(x_0, y_1) = D_x m_t^{x_0}(x_0)$,
Lemma 6.2 requires that $m_t^{x_0} \le h_t$ for every $0 \le t \le 1$. This last condition is equivalent to the property
characterizing non-negative cross-curvature in Lemma 6.1. This completes the proof of necessity and
the proof of the theorem. □
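
The sufficiency half of this argument can be probed numerically in the simplest non-negatively cross-curved case. The sketch below is an illustration only, under stated assumptions: it takes the one-dimensional bilinear preference $b(x, y) = xy$, discretizes $X = Y = [0, 1]$ on grids, and uses the fact that a function is a supremum of mountains (with products drawn from the grid) exactly when it equals its double transform $(u^{b^*})^b$. All function names and the particular mountains are hypothetical.

```python
def b(x, y):
    """Bilinear preference b(x, y) = x * y (one-dimensional illustration)."""
    return x * y

def transform(u, xs, ys):
    """Discrete b*-transform: u^{b*}(y) = max_x b(x, y) - u(x)."""
    return [max(b(x, y) - ux for x, ux in zip(xs, u)) for y in ys]

def back_transform(v, xs, ys):
    """Discrete b-transform: v^b(x) = max_y b(x, y) - v(y)."""
    return [max(b(x, y) - vy for y, vy in zip(ys, v)) for x in xs]

def b_convex_defect(u, xs, ys):
    """max |u - (u^{b*})^b| on the grid; zero when u is a supremum of grid mountains."""
    hull = back_transform(transform(u, xs, ys), xs, ys)
    return max(abs(a - h) for a, h in zip(u, hull))

xs = [k / 50 for k in range(51)]   # agent types in [0, 1]
ys = [k / 20 for k in range(21)]   # product types in [0, 1], spacing 0.05

# Two b-convex functions, each a maximum of two mountains x -> b(x, y) - lam.
u0 = [max(b(x, 0.2) - 0.05, b(x, 0.8) - 0.40) for x in xs]
u1 = [max(b(x, 0.0) - 0.00, b(x, 0.6) - 0.15) for x in xs]
u_half = [(p + q) / 2 for p, q in zip(u0, u1)]   # the midpoint u_t with t = 1/2

defect_half = b_convex_defect(u_half, xs, ys)
# A concave function is not a supremum of mountains, so its defect is large.
defect_concave = b_convex_defect([-(x - 0.5) ** 2 for x in xs], xs, ys)
```

Here `defect_half` is zero up to rounding, so the midpoint of the two $b$-convex functions is again $b$-convex on the grid, while `defect_concave` is bounded away from zero. This checks one instance only; Theorem 3.2 is what guarantees the behaviour for every non-negatively cross-curved $b$.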

Let us turn now to the convexity of the principal's problem. 

Proof of Corollary 4.4. Corollary 4.4 follows by combining the convexity of the set
$\mathcal{U}_\emptyset$ of feasible strategies proved in Theorem 3.2 with the convexity of $a(q)$ from Proposition 4.3.
If $\mu$ fails to vanish on each Lipschitz hypersurface, a little care is needed to deduce convexity of the
principal's objective $L(u)$ from that of $a(q)$, by invoking the conventions adopted in Remark 4.2 as
follows. Let $t \in [0, 1] \longmapsto u_t = (1 - t)u_0 + tu_1$ denote a line segment in the convex set
$\mathcal{U}_\emptyset$. If $q \in \partial u_t(x)$ for some $x \in X$, then $y_b(x, q) \in \partial^b u_t(x)$ by Theorem 3.1 of
Loeper [33] (a direct proof along the lines of Lemma 6.1 may be found in [25]). So $y_b(x, q)$ is among the
best responses of $x$ to price menu $v_t = u_t^{b^*}$. For each $t \in [0, 1]$ select $Du_t(x) \in \partial u_t(x)$
measurably to ensure $\min\{c(y_b(x, q)) - b(x, y_b(x, q)) \mid q \in \partial u_t(x)\}$ is achieved at
$q = Du_t(x)$. Then $a(Du_t(x)) \le a((1 - t)Du_0(x) + tDu_1(x))$ since
$(1 - t)Du_0(x) + tDu_1(x) \in \partial u_t(x)$. The desired convexity of $L(u)$ follows. □

Next we establish uniqueness of the principal's strategy. 

Proof of Theorem 4.5. Suppose both $u_0$ and $u_1$ minimize the principal's net losses $L(u)$ on the
convex set $\mathcal{U}_\emptyset$. Define the line segment $u_t = (1 - t)u_0 + tu_1$ and, in case $\mu$ fails to vanish
on each Lipschitz hypersurface, the measurable selection $Du_t(x) \in \partial u_t(x)$ as in the proof of
Corollary 4.4. The strict convexity of $a(q)$ asserted by Proposition 4.3 removes all freedom from this
selection. Under the hypotheses of Theorem 4.5, the same strict convexity implies the contradiction
$L(u_{1/2}) < \frac{1}{2}L(u_0) + \frac{1}{2}L(u_1) = L(u_1)$ unless $Du_0 = Du_1$ holds $\mu$-a.e. This establishes
the uniqueness $\mu$-a.e. of the agents' equilibrium strategies $y_{b,v}(x) := y_b(x, Du_1(x))$, and of the
principal's optimal measure $\nu := (y_{b,v})_\#\mu$ in (3.5).

Let $\mathrm{spt}\,\mu$ denote the smallest closed subset of $X$ containing the full mass of $\mu$. To identify
$u_0 = u_1$ on $\mathrm{spt}\,\mu$ and establish the remaining assertions is more technical. First observe that the
participation constraint $u_{1/2}(x) \ge b(x, y_\emptyset) - c(y_\emptyset) =: u_\emptyset(x)$ on the continuous function
$u_{1/2} \in \mathcal{U}_\emptyset$ must bind for some agent type $x_0 \in \mathrm{spt}\,\mu$; otherwise for $\epsilon > 0$
sufficiently small, $\max\{u_{1/2} - \epsilon, u_\emptyset\}$ would belong to $\mathcal{U}_\emptyset$ and reduce the principal's
losses by $\epsilon$, contradicting the asserted optimality of $u_{1/2}$. Since $u_{1/2}$ is a convex combination of
two other functions obeying the same constraint, we conclude $u_0(x_0) = u_1(x_0)$ coincides with the
reservation utility $u_\emptyset(x_0)$ for type $x_0$. Now use the map $y_{b,v} := y_b \circ Du_1$ from the first
paragraph of the proof to define a joint measure $\gamma := (\mathrm{id} \times y_{b,v})_\#\mu$ given by
$\gamma[U \times V] = \mu[U \cap y_{b,v}^{-1}(V)]$ for Borel $U \times V \subset X \times Y$, and denote by
$\mathrm{spt}\,\gamma$ the smallest closed subset $S \subset X \times Y$ carrying the full mass of $\gamma$. Notice
$\mathrm{spt}\,\gamma$ does not depend on $t \in [0, 1]$, nor in fact on $u_0$ or $u_1$; any other optimal strategy
for the principal would lead to the same $\gamma$.

Since the graph of $y_{b,v}$ lies in the closed set $\partial^b u_1 \subset X \times Y$, the same is true of
$S := \{(x_0, y_\emptyset)\} \cup \mathrm{spt}\,\gamma$. Thus $S$ is $b$-cyclically monotone (A.1) by the result of
Rochet [49] discussed immediately before Lemma A.1. Lemma A.1 then yields a minimal $b$-convex function
$u_-$ satisfying $u_-(x_0) = b(x_0, y_\emptyset) - c(y_\emptyset)$ for which $S \subset \partial^b u_-$. The fact that
$(x_0, y_\emptyset) \in S$ implies some mountain $b(\cdot, y_\emptyset) + \lambda$ bounds $u_-(\cdot)$ from below with
contact at $x_0$. Clearly $\lambda = -c(y_\emptyset)$ whence $u_- \in \mathcal{U}_\emptyset$.

Now we have $u_i \ge u_-$ for $i = 0, 1$ with equality at $x_0$. Also, $y_{b,v}(x) \in \partial^b u_-(x)$ for
$\mu$-almost all $x$, whence $u_-$ must be an optimal strategy: it is smaller in value than $u_i$ and produces
at least as favorable a response as $u_i$ from almost all agents. Finally since

$L(u_i) - L(u_-) \ge \int_X (u_i(x) - u_-(x))\,d\mu(x) \ge 0$,

the fact that $u_i$ minimizes the losses of the principal implies the continuous integrand vanishes
$\mu$-almost everywhere. Thus $u_i \ge u_-$ on $X$, with equality holding throughout $\mathrm{spt}\,\mu$ as desired.

Since $u_0$ was arbitrary, we have now proved that all optimal $u \in \mathcal{U}_\emptyset$ coincide with $u_1$
on $\mathrm{spt}\,\mu$. Optimality of $u$ also implies $\mathrm{spt}\,\gamma \subset \partial^b u$; if in addition the
participation constraint $u(x) \ge b(x, y_\emptyset) - c(y_\emptyset)$ binds at $x_0$, then $u \ge u_-$ on $X$.
Although $u_-$ appears to depend on our choice of $x_0 \in \mathrm{spt}\,\mu$ in the construction above this is not
actually the case: $u(x_0) = u_1(x_0)$ shows the participation constraint binds at $x_0$ for every optimal
strategy and $u_-$ is therefore uniquely determined by its minimality among optimal strategies
$u \in \mathcal{U}_\emptyset$.

Now, since any supremum of $b$-convex functions (not identically infinite) is again $b$-convex, define
$u_+ \in \mathcal{U}_\emptyset$ as the pointwise supremum among all of the principal's equilibrium strategies
$u \in \mathcal{U}_\emptyset$. The foregoing shows $u_+ = u_-$ on $\mathrm{spt}\,\mu$, while
$(x, y) \in \mathrm{spt}\,\gamma \subset \partial^b u$ implies

$u_+(\cdot) \ge u(\cdot) \ge u(x) + b(\cdot, y) - b(x, y) = u_+(x) + b(\cdot, y) - b(x, y)$

on $X$, whence $\mathrm{spt}\,\gamma \subset \partial^b u_+$. From here we deduce $L(u_+) \le L(u)$, hence $u_+$ is
itself an optimal strategy for the principal.

Finally, $v : Y \to \mathbb{R} \cup \{+\infty\}$ is an equilibrium price menu in Carlier's reformulation [13] if
and only if $u := v^b$ minimizes $L(u)$ on $\mathcal{U}_\emptyset$, in which case $u_- \le u \le u_+$ throughout $X$
implies $u_+^{b^*} \le (v^b)^{b^*} \le u_-^{b^*}$ throughout $Y$. Moreover, $u_- = u_+$ on $\mathrm{spt}\,\mu$ implies
$u_+^{b^*} = u_-^{b^*}$ on $\mathrm{spt}\,\nu$, since $y_{b,v}(x) \in \partial^b u_\pm(x)$ for $\mu$-a.e. $x$ implies
$u_\pm^{b^*}(y_{b,v}(x)) = b(x, y_{b,v}(x)) - u_\pm(x)$. We therefore conclude that if $v$ is an equilibrium
price menu, then $v \ge (v^b)^{b^*} \ge u_+^{b^*}$ on $Y$, with both equalities holding $\nu$-a.e. Conversely, if
$v : Y \to \mathbb{R} \cup \{+\infty\}$ satisfies $v \ge u_+^{b^*}$ with equality $\nu$-a.e., we deduce the same
must be true for its $b^*$-convex hull $(v^b)^{b^*}$, the latter being the largest $b^*$-convex function dominated
by $v$. Thus $(v^b)^{b^*}(y_\emptyset) = c(y_\emptyset)$ and $v^b \in \mathcal{U}_\emptyset$ and $v^b \le u_+$ throughout
$X$ with equality holding $\mu$-a.e. If $\mu$ vanishes on Lipschitz hypersurfaces, then $Dv^b = Du_+$ agree
$\mu$-a.e., so $L(v^b) = L(u_+)$ and $v^b$ is an optimal strategy for the principal as desired. If, on the
other hand, $\mu$ does not vanish on all Lipschitz hypersurfaces, then we may assume $v$ is its own
$b^*$-convex hull by Remark 4.2. Any mountain which touches $u_+^{b^*}$ from below on $\mathrm{spt}\,\nu$ also
touches $v \ge u_+^{b^*}$ from below at the same point, thus
$\partial^{b^*} u_+^{b^*} \subset \partial^{b^*} v$; since $v$ is $b^*$-convex this is equivalent to
$\partial^b u_+ \subset \partial^b v^b$. This shows the best response of $x$ facing price menu $u_+^{b^*}$ is also
one of his best responses facing price menu $v$: he cannot have a better response since his indirect
utility $v^b \le u_+$. The constraint on the agent's behaviour imposed by Remark 4.2 now implies
$L(v^b) \le L(u_+)$; equality must hold since $u_+$ is one of the principal's optimal strategies. This
confirms optimality of $v^b$ and concludes the proof of the theorem. □

To show stability of the equilibrium requires the following convergence result concerning Borel
probability measures $\mathcal{P}(X \times Y)$ on the product space.

Proposition 6.3 (Convergence of losses and mixed strategies). Suppose a sequence of triples
$(b_\infty, c_\infty, \mu_\infty) = \lim_{i \to \infty}(b_i, c_i, \mu_i)$ satisfy the hypotheses of Corollary 4.6. Let
$L_i(u)$ denote the net losses (4.5) by a principal who adopts strategy $u$ facing data
$(b_i, c_i, \mu_i)$. If any sequence $u_i$ of $b_i$-convex functions converge uniformly on $X$, then their limit
$u_\infty$ is $b_\infty$-convex and $L_\infty(u_\infty) = \lim_{i \to \infty} L_i(u_i)$.
Furthermore, there is a unique joint measure $\gamma_\infty \in \mathcal{P}(X \times Y)$ supported in
$\partial^{b_\infty} u_\infty$ with left marginal $\mu_\infty$, and any sequence of joint measures
$\gamma_i \in \mathcal{P}(X \times Y)$ vanishing outside $\partial^{b_i} u_i$ and with left marginal $\mu_i$,
must converge weakly-* to $\gamma_\infty$.

Proof. Assume a sequence $u_i \to u_\infty$ of $b_i$-convex functions converges uniformly on $X$. Topologizing
the continuous functions $C(Z)$ by uniform convergence, where $Z = X$, $Y$ or $X \times Y$, makes the
transformation $(b, u) \longmapsto u^{b^*}$ given by (3.1) continuous on $C(X \times Y) \times C(X)$. This fact
allows us to take $i \to \infty$ in the relation $(u_i^{b_i^*})^{b_i} = u_i$ to conclude $b_\infty$-convexity of
$u_\infty$. From the semiconvexity (4.4) of $u_\infty$ we infer its domain of differentiability
$\mathrm{Dom}\, Du_\infty$ exhausts $X$ apart from a countable collection of Lipschitz hypersurfaces, which are
$\mu_\infty$-negligible by hypothesis. Define the map $G_\infty(x) = y_{b_\infty}(x, Du_\infty(x))$ on
$\mathrm{Dom}\, Du_\infty$. Since $\partial^{b_\infty} u_\infty \cap (\mathrm{Dom}\, Du_\infty \times Y)$ coincides with
the graph of $G_\infty$, any measure $\gamma_\infty$ supported in $\partial^{b_\infty} u_\infty$ with left marginal
$\mu_\infty$ is given (6.6) by $\gamma_\infty := (\mathrm{id} \times G_\infty)_\#\mu_\infty$ as in, e.g., Lemma 2.1 of
Ahmad et al. [1]. This specifies $\gamma_\infty$ uniquely.

Now suppose $\gamma_i \ge 0$ is a sequence of measures supported in $\partial^{b_i} u_i$ having left marginal
$\mu_i$. Compactness allows us to extract from any subsequence of $\gamma_i$ a further subsequence which
converges weakly-* to some limit $\gamma \in \mathcal{P}(X \times Y)$. Since $\mu_i \rightharpoonup \mu_\infty$
the left marginal of $\gamma$ is given by $\mu_\infty$. Moreover, since
$u_i(x) + u_i^{b_i^*}(y) \ge b_i(x, y)$ throughout $X \times Y$ with equality on $\mathrm{spt}\,\gamma_i$, uniform
convergence of this expression yields $\mathrm{spt}\,\gamma \subset \partial^{b_\infty} u_\infty$. The uniqueness
result of the preceding paragraph then asserts $\gamma = \gamma_\infty$ independently of the choice of
subsequence, so the full sequence $\gamma_i \rightharpoonup \gamma_\infty$ converges weakly-*.

Finally, use the measurable selection $Du_i(x) \in \partial u_i(x)$ of Remark 4.2 to extend $Du_i(x)$ from
$\mathrm{Dom}\, Du_i$ to $X$ so as to guarantee that $G_i(x) := y_{b_i}(x, Du_i(x)) \in \partial^{b_i} u_i(x)$. Use
the Borel map $G_i : X \to Y$ to push $\mu_i$ forward to the joint probability measure
$\gamma_i := (\mathrm{id} \times G_i)_\#\mu_i$ on $X \times Y$ defined by

(6.6)    $\gamma_i[U \times V] := \mu_i[U \cap G_i^{-1}(V)]$

for each Borel $U \times V \subset X \times Y$. Notice $\gamma_i$ is supported in $\partial^{b_i} u_i$ and has
$\mu_i$ for its left marginal, hence converges weakly-* to $\gamma_\infty$. Moreover, our choice of measurable
selection guarantees that the net losses (4.5) of the principal choosing strategy $u_i$ coincide with

(6.7)    $L_i(u_i) = \int_{X \times Y} \big(c_i(y) - u_i^{b_i^*}(y)\big)\,d\gamma_i(x, y)$.

Weak-* convergence of the measures $\gamma_i \rightharpoonup \gamma_\infty$ couples with uniform convergence of
the integrands to yield the desired limit

$\lim_{i \to \infty} L_i(u_i) = \int_{X \times Y} \big(c_\infty(y) - u_\infty^{b_\infty^*}(y)\big)\,d\gamma_\infty(x, y) = L_\infty(u_\infty)$

and establish the proposition. □
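
For a discrete distribution of agent types, the push-forward construction (6.6) reduces to bookkeeping, as the following sketch suggests. It is offered only as an illustration: the four agent types, their masses, and the selection map `G` (low types take a null product labelled `0.0`, high types a product labelled `1.0`) are all hypothetical and are not derived from any optimization.

```python
from collections import defaultdict

def push_forward_joint(mu, G):
    """Joint measure gamma = (id x G)_# mu for a discrete mu given as {atom: mass}."""
    gamma = defaultdict(float)
    for x, mass in mu.items():
        gamma[(x, G(x))] += mass   # all mass of agent x sits on the pair (x, G(x))
    return dict(gamma)

def marginals(gamma):
    """Left marginal (agent types) and right marginal (products) of gamma."""
    left, right = defaultdict(float), defaultdict(float)
    for (x, y), mass in gamma.items():
        left[x] += mass
        right[y] += mass
    return dict(left), dict(right)

# Hypothetical data: four equally likely agent types and a two-product selection rule.
mu = {0.1: 0.25, 0.4: 0.25, 0.6: 0.25, 0.9: 0.25}

def G(x):
    return 0.0 if x < 0.5 else 1.0

gamma = push_forward_joint(mu, G)
left, nu = marginals(gamma)
```

The left marginal `left` reproduces `mu`, while the right marginal `nu` plays the role of the measure $(G_i)_\#\mu_i$ of products sold; integrating a function of $y$ alone against `gamma`, as in (6.7), is then the same as integrating it against `nu`.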

Proof of Corollary 4.6. Let $\mathcal{U}_\emptyset^i$ denote the space of $b_i$-convex functions
$u(\cdot) \ge b_i(\cdot, y_\emptyset) - c_i(y_\emptyset)$, and $L_i(u)$ denote the net loss of the principal who
chooses strategy $u$ facing the triple $(b_i, c_i, \mu_i)$. The $L_i$-minimizing strategies
$u_i \in \mathcal{U}_\emptyset^i$ are Lipschitz and semiconvex, with upper bounds (4.4) on $|Du_i|$ and
$-D^2 u_i$ which are independent of $i$ since $\|b_i - b_\infty\|_{C^2} \to 0$. The Ascoli-Arzelà theorem
therefore yields a subsequence $u_{i(j)}$ which converges uniformly to a limit $\bar{u}$ on the compact set
$X$. Since the functions $u_i$ have a semiconvexity constant independent of $i$, it is a well-known corollary
that their gradients also converge $Du_{i(j)}(x) \to D\bar{u}(x)$ pointwise on the set of common
differentiability $\mathrm{Dom}\, D\bar{u} \cap \bigcap_i \mathrm{Dom}\, Du_i$. This set exhausts $X$ up to a
countable union of Lipschitz hypersurfaces, which is $\mu_\infty$-negligible by hypothesis. Setting
$G_i(x) = y_{b_i}(x, Du_i(x))$, it is not hard to deduce
$y_{b_\infty}(x, D\bar{u}(x)) = \lim_{j \to \infty} G_{i(j)}(x)$ on this set from Definition 4.1. If we can now
prove $\bar{u}$ minimizes $L_\infty(u)$ on $\mathcal{U}_\emptyset^\infty$, the uniqueness of equilibrium product
selected by $\mu_\infty$-a.e. agent $x \in X$ in Theorem 4.5 will then imply that
$\lim_{j \to \infty} G_{i(j)}(x) = G_\infty(x)$ converges to a limit independent of the subsequence
chosen, hence the full sequence $G_i(x)$ converges $\mu_\infty$-a.e.

To see that $\bar{u}$ minimizes $L_\infty(u)$ on $\mathcal{U}_\emptyset^\infty$, observe
$u \in \mathcal{U}_\emptyset^\infty$ implies $u^{b_\infty^* b_i} \in \mathcal{U}_\emptyset^i$ is $L_i$-feasible,
being the $b_i$-transform of a price menu $u^{b_\infty^*}(\cdot)$ agreeing with $c_\infty(\cdot)$ at $y_\emptyset$.
Moreover, $u^{b_\infty^* b_i} \to u^{b_\infty^* b_\infty}$ uniformly as $i \to \infty$ (by continuity of the
$b$-transform asserted in the first paragraph of the preceding proof). The optimality of $u_i$ therefore
yields $L_i(u_i) \le L_i(u^{b_\infty^* b_i})$. Proposition 6.3 allows us to deduce
$L_\infty(\bar{u}) \le L_\infty(u)$ by taking the subsequential limit $j \to \infty$. Since the same proposition
asserts $b_\infty$-convexity of $\bar{u}$, we find $\bar{u} \in \mathcal{U}_\emptyset^\infty$ is the desired
minimizer after taking the limit $j \to \infty$ in
$u_{i(j)}(\cdot) \ge b_{i(j)}(\cdot, y_\emptyset) - c_{i(j)}(y_\emptyset)$. This concludes the proof of
$\mu_\infty$-a.e. convergence of the maps

$G_\infty(x) = \lim_{i \to \infty} G_i(x)$.

Turning to the optimal measures: as in the preceding proof, a measurable selection
$Du_i(x) \in \partial u_i(x)$ consistent with the tie-breaking hypotheses of Remark 4.2 may be used to
extend the Borel map $G_i(x) = y_{b_i}(x, Du_i(x))$ from $\mathrm{Dom}\, Du_i$ to $X$ and define a joint measure
$\gamma_i := (\mathrm{id} \times G_i)_\#\mu_i$ supported on $\partial^{b_i} u_i$ as in (6.6). The left marginal of
$\gamma_i$ is obviously given by $\mu_i$, and its right marginal coincides with the unique optimal measure
$\nu_i$ given by Theorem 4.5. Proposition 6.3 then yields weak-* convergence of
$\gamma_i \rightharpoonup \gamma_\infty$ and hence of $\nu_i \rightharpoonup \nu_\infty$. Theorem 4.5 also asserts
the two minimizers $u_\infty = \bar{u}$ agree $\mu_\infty$-a.e. In this case the uniform limit $\bar{u}$ is
independent of the Ascoli-Arzelà subsequence, hence we recover convergence of the full sequence
$u_i \to u_\infty$ in $L^\infty(X, d\mu_\infty)$. □

Finally, let us extend Armstrong's desirability of exclusion to our model. Our proof is inspired 
by Armstrong's [3], but differs from his in a number of ways. 

Proof of Theorem 4.7. Use the $C^3$-smooth diffeomorphism $x \in X \longmapsto p = D_y b(x, y_\emptyset) \in X_{y_\emptyset}$
provided by (B0)-(B2) and its inverse $p \in X_{y_\emptyset} \longmapsto x = x_b(y_\emptyset, p) \in X$ to reparameterize the space of
agents over the strictly convex set $X_{y_\emptyset}$. Then $\tilde u(p) := u(x_b(y_\emptyset, p)) - b(x_b(y_\emptyset, p), y_\emptyset) + c(y_\emptyset)$ defines
a non-negative $\tilde b$-convex function, where $\tilde b(p, y) := b(x_b(y_\emptyset, p), y) - b(x_b(y_\emptyset, p), y_\emptyset) + c(y_\emptyset)$. In other
words, the space $\mathcal{U}_0$ corresponds to the space $\tilde{\mathcal{U}}_0$ of non-negative $\tilde b$-convex functions on $X_{y_\emptyset}$ in the
new parameterization. This subtraction of the reservation utility from the preference function does
not change any agent's response to a price menu $v$ offered by the principal, since preferences between
different agent types are never compared. However, it does make the preference function $\tilde b(p, y)$ a
convex function of $p \in X_{y_\emptyset}$, as is easily seen by interchanging the roles of $x$ and $y$ in Lemma 6.1.
The indirect utility $\tilde u(p) = v^{\tilde b}(p)$ is then also convex, being a supremum (3.1) of such preference
functions.

In the new variables, the distribution of agents $\tilde f(p)\,dp = f(x)\,dx$ is given by
$\tilde f(p) = f(x_b(y_\emptyset, p)) \det[\partial x_b^i(y_\emptyset, p)/\partial p_j]$.
The principal's net losses $\tilde L(\tilde u) = L(u)$ are given as in (4.5) by

$$\tilde L(\tilde u) = \int_{X_{y_\emptyset}} \tilde a(D\tilde u(p), \tilde u(p), p)\,\tilde f(p)\,dp,$$

where $\tilde a(q, s, p) = c(y_{\tilde b}(p, q)) - \tilde b(p, y_{\tilde b}(p, q)) + s$ is a convex function of $q$ on $\tilde Y_p := D_p \tilde b(p, Y)$ for
each fixed $p$ and $s$, according to Proposition 4.3 (recall that $\tilde b \in C^3(X_{y_\emptyset} \times Y)$ satisfies the same
hypotheses (B0)-(B3) as $b \in C^4(\bar X \times \bar Y)$, except for the possibility that four continuous derivatives
with respect to variables in $X_{y_\emptyset}$ fail to exist, which is irrelevant as already discussed). This convexity
implies

$$\tilde a(q, s, p) \ge \tilde a(q_0, s, p) + \langle D_q \tilde a(q_0, s, p),\, q - q_0 \rangle$$

for all $q, q_0 \in \tilde Y_p$. With $p$ still fixed, the choice $q_0 = D_p \tilde b(p, y_\emptyset) = 0$ shows $\tilde a(0, s, p) = s$, whence
$\tilde a(q, s, p) \ge \langle D_q \tilde a(0, s, p), q \rangle$ for $s = \tilde u(p) \ge 0$.
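
The identity $\tilde a(0,s,p) = s$ used here follows directly from the definitions. A short derivation is sketched below; tildes mark the quantities in the reparameterized variable $p$, and the sketch assumes (as in Definition 4.1) that $y_{\tilde b}(p,q)$ denotes the unique product $y$ solving $D_p \tilde b(p,y) = q$.

```latex
\begin{align*}
\tilde b(p,y_\emptyset)
  &= b(x_b(y_\emptyset,p),y_\emptyset) - b(x_b(y_\emptyset,p),y_\emptyset) + c(y_\emptyset)
   = c(y_\emptyset) \quad\text{for all } p, \\
\text{hence}\quad D_p\tilde b(p,y_\emptyset) &= 0
  \quad\text{and}\quad y_{\tilde b}(p,0) = y_\emptyset, \\
\tilde a(0,s,p) &= c(y_\emptyset) - \tilde b(p,y_\emptyset) + s = s .
\end{align*}
```

Taking $q_0 = 0$ in the convexity inequality then gives $\tilde a(q,s,p) \ge s + \langle D_q \tilde a(0,s,p), q\rangle$, and discarding the non-negative term $s$ yields the stated bound.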

Now suppose $\tilde u \in \tilde{\mathcal{U}}_0$ minimizes $\tilde L(\tilde u)$. For $\epsilon \ge 0$, define the continuously increasing family of
compact convex sets $U_\epsilon := \{p \in X_{y_\emptyset} \mid \tilde u(p) \le \epsilon\}$. Observe that $U_0$ must be non-empty, since
otherwise for $\epsilon > 0$ small enough $U_\epsilon$ would be empty, and then $\tilde u - \epsilon \in \tilde{\mathcal{U}}_0$ is a better strategy,
reducing the principal's losses by $\epsilon$. We now claim the interior of the set $U_0$ (which corresponds to
agents who decline to participate) contains a non-zero fraction of the total population of agents.
Our argument is inspired by the strategy Armstrong worked out in a special case [3], which was to
show that unless this conclusion is true, the profit the principal extracts from agents in $U_\epsilon$ would
vanish at a higher order than $\epsilon > 0$, making $\tilde u_\epsilon := \max\{\tilde u - \epsilon, 0\} \in \tilde{\mathcal{U}}_0$ a better strategy than $\tilde u$ for
the principal when $\epsilon$ is sufficiently small.

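For $\epsilon > 0$, the contribution of $U_\epsilon$ to the principal's profit is given by

$$
\begin{aligned}
-\tilde L_\epsilon(\tilde u) &:= -\int_{U_\epsilon} \tilde a(D\tilde u(p), \tilde u(p), p)\,\tilde f(p)\,dp \\
&\le -\int_{U_\epsilon} \langle D_q \tilde a(0, \tilde u(p), p),\, D\tilde u(p) \rangle\,\tilde f(p)\,dp \\
&= \int_{U_\epsilon} \tilde u(p)\,\nabla_p \cdot \big(\tilde f(p)\, D_q \tilde a(0, \tilde u(p), p)\big)\,dp
 - \int_{\partial U_\epsilon} \tilde u(p)\,\langle D_q \tilde a, \hat n \rangle\,\tilde f(p)\,dS(p)
\end{aligned}
\tag{6.8}
$$

where $\hat n = \hat n_{U_\epsilon}(p)$ denotes the outer unit normal to $U_\epsilon$ at $p$, and the divergence theorem has been
used. Here $\partial U_\epsilon$ denotes the boundary of the convex set $U_\epsilon$, and $dS(p)$ denotes the $n - 1$ dimensional
surface (i.e. Hausdorff) measure on this boundary. (For Sobolev functions, the integration by parts
formula that we need is contained in §4.3 of [17] under the additional restriction that the vector field
$\tilde u(\cdot) D_q \tilde a(0, \tilde u(\cdot), \cdot)$ be $C^1$ smooth, but extends immediately to Lipschitz vector fields by
approximation; the operation of restricting $\tilde f$ to the boundary of $U_\epsilon$ is there shown to give a bounded linear
map from $W^{1,1}(U_\epsilon, dp)$ to $L^1(\partial U_\epsilon, dS)$ called the boundary trace.) As $\epsilon \to 0$, we claim both integrals
in (6.8) vanish at rate $o(\epsilon)$ if the interior of $U_0$ is empty. To see this, note $\tilde u = \epsilon$ on $\partial U_\epsilon \cap \operatorname{int} X_{y_\emptyset}$, so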

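$$
\begin{aligned}
\int_{\partial U_\epsilon} \tilde u(p)\,\langle D_q \tilde a, \hat n \rangle\,\tilde f(p)\,dS(p)
&= \epsilon \int_{\partial U_\epsilon} \langle D_q \tilde a, \hat n \rangle\,\tilde f(p)\,dS(p)
 + \int_{\partial U_\epsilon \cap \partial X_{y_\emptyset}} [\tilde u(p) - \epsilon]\,\langle D_q \tilde a, \hat n \rangle\,\tilde f(p)\,dS(p) \\
&= \epsilon \int_{U_\epsilon} \nabla_p \cdot \big(\tilde f(p)\, D_q \tilde a(0, \tilde u(p), p)\big)\,dp
 + \int_{\partial U_\epsilon \cap \partial X_{y_\emptyset}} [\tilde u(p) - \epsilon]\,\langle D_q \tilde a, \hat n \rangle\,\tilde f(p)\,dS(p).
\end{aligned}
$$

Since $0 \le \tilde u \le \epsilon$ in $U_\epsilon$, we combine the last identity with (6.8) to obtain

$$
-\frac{\tilde L_\epsilon(\tilde u)}{\epsilon} \le \int_{U_\epsilon} \big|\nabla_p \cdot \big(\tilde f(p)\, D_q \tilde a(0, \tilde u(p), p)\big)\big|\,dp
 + \int_{\partial U_\epsilon \cap \partial X_{y_\emptyset}} \big|\langle D_q \tilde a, \hat n \rangle\,\tilde f(p)\big|\,dS(p).
\tag{6.9}
$$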
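
For readers who want the intermediate step, the passage from (6.8) and the boundary identity to (6.9) can be spelled out as follows. This is only a sketch of the bookkeeping: the shorthand $V$ is introduced here for illustration and is not notation from the original argument, and tildes mark the quantities in the reparameterized variable $p$.

```latex
% Shorthand (an assumption of this sketch): V(p) := \tilde f(p)\, D_q \tilde a(0,\tilde u(p),p).
\begin{align*}
-\tilde L_\epsilon(\tilde u)
 &\le \int_{U_\epsilon} \tilde u \,\nabla_p\!\cdot V \,dp
      - \int_{\partial U_\epsilon} \tilde u \,\langle V,\hat n\rangle \,dS \\
 &= \int_{U_\epsilon} (\tilde u-\epsilon)\,\nabla_p\!\cdot V \,dp
      + \int_{\partial U_\epsilon\cap\partial X_{y_\emptyset}} (\epsilon-\tilde u)\,\langle V,\hat n\rangle \,dS \\
 &\le \epsilon \int_{U_\epsilon} |\nabla_p\!\cdot V| \,dp
      + \epsilon \int_{\partial U_\epsilon\cap\partial X_{y_\emptyset}} |\langle V,\hat n\rangle| \,dS .
\end{align*}
% The last step uses 0 \le \tilde u \le \epsilon on U_\epsilon; dividing by \epsilon gives (6.9).
```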



Notice that domain monotonicity implies the $\epsilon \to 0$ limit of the last expressions above is given by
integrals over the limiting domain $U_0 = \bigcap_{\epsilon > 0} U_\epsilon$. Assume now the interior of the convex set $U_0$ is
empty, so that $U_0$ has dimension at most $n - 1$. Then the volume $|U_\epsilon| = o(1)$, hence the first integral
in the right hand side dwindles to zero as $\epsilon \to 0$ (recalling that $\tilde u$ is Lipschitz, $\tilde f \in W^{1,1}$ and $\tilde a \in C^3$).
Concerning the second term, if the convex set $U_0$ has dimension $n - 1$ then its relative interior must
be disjoint from the boundary of the convex body $X_{y_\emptyset}$, since the latter is assumed to have no $n - 1$
dimensional facets. Either way $U_0 \cap \partial X_{y_\emptyset}$ has dimension at most $n - 2$, which implies that

$$\int_{U_\epsilon \cap \partial X_{y_\emptyset}} dS(p) = o(1)$$

as $\epsilon \to 0$. All in all, we have shown $\tilde L_\epsilon(\tilde u) = o(\epsilon)$ as $\epsilon \to 0$ whenever $U_0$ has empty interior, which
(as was explained above) contradicts the asserted optimality of the strategy $\tilde u$. However, even
if $U_0$ has non-empty interior, more must be true to avoid inferring the contradictory conclusion
$\tilde L_\epsilon(\tilde u) = o(\epsilon)$ as $\epsilon \to 0$ from (6.9): one of the two limiting integrals

$$
\int_{U_0} \big|\nabla_p \cdot \big(\tilde f(p)\, D_q \tilde a(0, \tilde u(p), p)\big)\big|\,dp > 0
\quad\text{or}\quad
\int_{U_0 \cap \partial X_{y_\emptyset}} |\langle D_q \tilde a, \hat n \rangle|\,\tilde f(p)\,dS(p) > 0
$$

must be non-vanishing. In either case, the $W^{1,1}$ density $\tilde f$ must be positive somewhere in $U_0$, whose
interior therefore includes a positive fraction of the agents. Since $\tilde u$ is differentiable with vanishing
gradient on the interior of $U_0$, there is no ambiguity in the strategy of these agents: they respond
to $\tilde u$ by choosing the null product. □

Appendix A. Minimal b-convex potentials

The purpose of this appendix is to establish a mathematical result (and some terminology) needed
in the last part of the uniqueness proof, Theorem 4.5. In particular, we establish a minimality
property enjoyed by Rochet's construction of a $b$-convex function for which $\partial^b u$ contains a prescribed
set [49]; Rochet's construction is modeled on the analogous construction by Rockafellar of a convex
function $u$ whose subdifferential $\partial u$ contains a given cyclically monotone set [52].

Recall a relation $S \subset X \times Y$ is $b$-cyclically monotone if for each integer $k \in \mathbf{N}$ and $k$-tuple of
points $(x_1, y_1), \ldots, (x_k, y_k) \in S$, the inequality

$$\sum_{i=1}^{k} \big( b(x_i, y_i) - b(x_{i+1}, y_i) \big) \ge 0 \tag{A.1}$$

holds with $x_{k+1} := x_1$. For a function $u : X \longmapsto \mathbf{R} \cup \{+\infty\}$, the relation $\partial^b u \subset X \times Y$ consists of
those points $(x, y)$ such that

$$u(\cdot) \ge u(x) + b(\cdot, y) - b(x, y) \tag{A.2}$$

holds throughout $X$. Rochet's generalization of Rockafellar's theorem asserts that $S \subset X \times Y$ is
$b$-cyclically monotone if and only if there exists a $b$-convex function $u : X \longmapsto \mathbf{R} \cup \{+\infty\}$ such that
$S \subset \partial^b u$; see also [21] [31] [56]. Here we need to extract a certain minimality property from its proof.

Lemma A.1. Given a $b$-cyclically monotone $S \subset X \times Y$ and $(x_0, y_0) \in S$, there is a $b$-convex
function $u$ vanishing at $x_0$ and satisfying $S \subset \partial^b u$, which is minimal in the sense that $u \le \tilde u$ for all
$\tilde u : X \longmapsto \mathbf{R} \cup \{+\infty\}$ vanishing at $x_0$ with $S \subset \partial^b \tilde u$.

Proof. Given a $b$-cyclically monotone $S \subset X \times Y$ and $(x_0, y_0) \in S$, Rochet [49] verified the elementary
fact that the following formula defines a $b$-convex function $u$ for which $S \subset \partial^b u$:

$$
u(\cdot) = \sup_{k \in \mathbf{N}} \ \sup_{(x_1, y_1), \ldots, (x_k, y_k) \in S} \ b(\cdot, y_k) - b(x_0, y_0) + \sum_{i=1}^{k} \big( b(x_i, y_{i-1}) - b(x_i, y_i) \big).
\tag{A.3}
$$

Taking $k = 0$ shows $u(x_0) \ge 0$, while the opposite inequality $u(x_0) \le 0$ follows from $b$-cyclical
monotonicity (A.1) of $S$. Now suppose $\tilde u(x_0) = 0$ and $S \subset \partial^b \tilde u$. For each $k \in \mathbf{N}$ and $k$-tuple in $S$, we
claim $\tilde u(\cdot)$ exceeds the expression under the supremum in (A.3). Indeed, $(x_i, y_i) \in S \subset \partial^b \tilde u$ implies

$$\tilde u(x_{i+1}) \ge \tilde u(x_i) + b(x_{i+1}, y_i) - b(x_i, y_i)$$

and $\tilde u(x_i) < \infty$, by evaluating (A.2) at $x_i$ and at $x_0$. Summing the displayed inequalities from
$i = 0, \ldots, k$, arbitrariness of $x_{k+1} \in X$ yields the desired result: $\tilde u(x_{k+1}) \ge u(x_{k+1})$. □
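
For a finite relation $S$, formula (A.3) can be evaluated numerically, which may help in checking small examples. The sketch below is an illustration under stated assumptions, not part of the paper: the name `rochet_potential` is hypothetical, `S[0]` plays the role of $(x_0, y_0)$, and the supremum over chains is computed by a Bellman-Ford style longest-path relaxation. That swap is legitimate only because $b$-cyclical monotonicity (A.1) makes every cycle contribute a non-positive amount; a remaining improvement after the relaxation signals a violation of (A.1).

```python
from itertools import product


def rochet_potential(S, b, tol=1e-12):
    """Evaluate formula (A.3) for a finite list S of (x, y) pairs.

    S[0] is taken as (x_0, y_0). Raises ValueError if S is not
    b-cyclically monotone (detected as a positive-gain cycle).
    """
    n = len(S)
    # w[i][j]: gain b(x_j, y_i) - b(x_i, y_i) from extending a chain
    # that currently ends at (x_i, y_i) by the pair (x_j, y_j).
    w = [[b(S[j][0], S[i][1]) - b(S[i][0], S[i][1]) for j in range(n)]
         for i in range(n)]
    # d[j]: best total gain over chains from (x_0, y_0) to (x_j, y_j),
    # which equals u(x_j) when S is b-cyclically monotone.
    d = [float("-inf")] * n
    d[0] = 0.0
    for _ in range(n - 1):
        for i, j in product(range(n), repeat=2):
            if d[i] + w[i][j] > d[j]:
                d[j] = d[i] + w[i][j]
    # Any further improvement means some cycle has positive gain,
    # i.e. inequality (A.1) fails for S.
    for i, j in product(range(n), repeat=2):
        if d[i] + w[i][j] > d[j] + tol:
            raise ValueError("S is not b-cyclically monotone")

    def u(x):
        # Final step of (A.3): the term b(., y_k) - b(x_k, y_k) added
        # to the best chain gain ending at index k.
        return max(d[j] + b(x, S[j][1]) - b(S[j][0], S[j][1])
                   for j in range(n))

    return u


# Example with b(x, y) = x * y on the real line and S on the graph y = x.
u = rochet_potential([(0, 0), (1, 1), (2, 2)], lambda x, y: x * y)
print(u(0), u(2), u(3))
```

In this example the potential is the piecewise-linear function $\max\{0,\, x - 1,\, 2x - 3\}$, which vanishes at $x_0 = 0$ and lies below $x^2/2$, another convex function vanishing at $0$ whose subdifferential contains $S$. This is consistent with the minimality asserted in Lemma A.1.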